Pre-indexing Techniques in Arabic Information Retrieval
Souheila Ben Guirat, Ibrahim Bounhas, Yahia Slimani
2019
Abstract
Arabic document indexing remains challenging given the morphological specificities of the language. Although much effort has been devoted to the field, more efficient indexing approaches are increasingly needed. One of the most important issues concerns the choice of indexing units (e.g. stems, roots, lemmas), which should both enhance retrieval efficiency and optimize the indexing process. The question is how to process Arabic texts to retrieve the basic forms that best reflect the meaning of words and documents. In the literature, several indexing units have been compared, and combining multiple indexes seems promising. In our previous work, we showed that hybrid indexes based on stems, patterns and roots enhance results. However, the optimal weight of each indexing unit remains to be found. This paper therefore aims to contribute to optimizing hybrid indexing. We compare and evaluate four pre-indexing methods.
Paper Citation
in Harvard Style
Ben Guirat S., Bounhas I. and Slimani Y. (2019). Pre-indexing Techniques in Arabic Information Retrieval. In Proceedings of the 11th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART, ISBN 978-989-758-350-6, pages 237-246. DOI: 10.5220/0007393402370246
in Bibtex Style
@conference{icaart19,
author={Souheila Ben Guirat and Ibrahim Bounhas and Yahia Slimani},
title={Pre-indexing Techniques in Arabic Information Retrieval},
booktitle={Proceedings of the 11th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART},
year={2019},
pages={237-246},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0007393402370246},
isbn={978-989-758-350-6},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 11th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART
TI - Pre-indexing Techniques in Arabic Information Retrieval
SN - 978-989-758-350-6
AU - Ben Guirat S.
AU - Bounhas I.
AU - Slimani Y.
PY - 2019
SP - 237
EP - 246
DO - 10.5220/0007393402370246