Laroussi Merhbene, Anis Zouaghi, Mounir Zrigui



In this paper we propose a hybrid system for Arabic word sense disambiguation. To this end, we use methods from the domain of information retrieval (Latent Semantic Analysis (LSA), Harman, Croft and Okapi) combined with the Lesk algorithm. These methods estimate the most relevant sense of the ambiguous word by computing the proximity between the current context (the context of the ambiguous word) and the different contexts of use of each sense of the word. The Lesk algorithm is then used to select the correct sense among those proposed by LSA, Harman, Croft and Okapi. The results obtained by the proposed system are satisfactory: we obtained a disambiguation rate of 76%.
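The proximity idea described in the abstract can be illustrated with a minimal Lesk-style sketch. This is not the system evaluated in the paper: the sense inventory, the stopword list, the function names and the English toy data are all hypothetical, and plain word overlap stands in for the LSA, Harman, Croft and Okapi measures.

```python
from collections import Counter

# Hypothetical toy sense inventory: each sense is paired with a few
# example contexts of use (assumption, English used for readability).
SENSE_CONTEXTS = {
    "bank/finance": ["deposit money account loan",
                     "the bank raised interest rates"],
    "bank/river": ["river water shore fishing",
                   "they walked along the river bank"],
}

# Hypothetical minimal stopword list.
STOPWORDS = {"the", "a", "along", "they", "of", "in", "to"}


def tokens(text):
    """Lowercase, split on whitespace, and drop stopwords."""
    return [w for w in text.lower().split() if w not in STOPWORDS]


def overlap(context, sense_text):
    """Count the words shared by two texts (multiset intersection)."""
    a, b = Counter(tokens(context)), Counter(tokens(sense_text))
    return sum((a & b).values())


def disambiguate(context, sense_contexts):
    """Score each sense by its best-overlapping context of use
    and return the highest-scoring sense with all scores."""
    scores = {
        sense: max(overlap(context, c) for c in contexts)
        for sense, contexts in sense_contexts.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    sentence = "he opened an account at the bank to deposit money"
    print(disambiguate(sentence, SENSE_CONTEXTS))
```

In this sketch the current context is compared with every stored context of use, and the sense whose context is closest wins. A fuller system would presumably replace `overlap` with a weighted similarity and apply Arabic-specific preprocessing such as root extraction.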


  1. Al-Shalabi, R., Kanaan, G. and Al-Serhan, H., 2003. New approach for extracting Arabic roots. Paper presented at the International Arab Conference on Information Technology (ACIT'2003), Egypt.
  2. Black, W. J. and Elkateb, S., 2004. A Prototype English-Arabic Dictionary Based on WordNet. Proceedings of the 2nd Global WordNet Conference (GWC2004), Czech Republic, pp. 67-74.
  3. Croft, W., 1983. Experiments with representation in a document retrieval system. Information Technology: Research and Development, 2(1), pp. 1-21.
  4. De Loupy, 2000. Assessing the contribution of linguistic knowledge in semantic disambiguation and information retrieval. Thesis, University of Avignon and Pays de Vaucluse.
  5. Deerwester, S., Dumais, S.T., Furnas, G.W., Landauer, T.K. and Harshman, R., 1990. Indexing by Latent Semantic Analysis. Journal of the American Society for Information Science, 41(6), pp. 391-407.
  6. Harman, D., 1986. An experimental study of factors important in document ranking. Proceedings of the ACM Conference on Research and Development in Information Retrieval, Pisa, Italy.
  7. Ide, N. and Véronis, J., 1998. Word Sense Disambiguation: The State of the Art. Computational Linguistics, 24(1), pp. 1-40.
  8. Karov, Y. and Edelman, S., 1998. Similarity-based word sense disambiguation. Computational Linguistics, 24(1).
  9. Lesk, M., 1986. Automatic sense disambiguation using machine readable dictionaries: how to tell a pine cone from an ice cream cone. ACM Special Interest Group for Design of Communication, Proceedings of the 5th Annual International Conference on Systems Documentation, pp. 24-26. ISBN 0897912241.
  10. Robertson, S., Walker, M., Hancock-Beaulieu and Gatford, M., 1994. Okapi at TREC-3. Third Text Retrieval Conference (TREC-3), NIST special publication 500-225, pp. 109-126, Gaithersburg, Maryland, USA.
  11. Salton, G. and Buckley, C., 1988. Term-weighting approaches in automatic text retrieval. Information Processing and Management, 24(5), pp. 513-523.
  12. Sawalha, M. and Atwell, E., 2008. Comparative Evaluation of Arabic Language Morphological Analysers and Stemmers. Coling 2008: Companion volume - Posters and Demonstrations, pp. 107-110, Manchester, August 2008.
  13. Vasilescu, F., 2003. Monolingual corpus disambiguation by the approaches of Lesk. Master of Science (MSc) thesis in computer science, Faculty of Graduate Studies, University of Montreal, Faculty of Arts and Sciences.
  14. Zouaghi, A., Zrigui, M. and Antoniadis, G., 2008. Understanding of the Arabic spontaneous speech: A numeric modelisation. Revue TAL VARIA.

Paper Citation

in Harvard Style

Merhbene L., Zouaghi A. and Zrigui M. (2010). ARABIC WORD SENSE DISAMBIGUATION. In Proceedings of the 2nd International Conference on Agents and Artificial Intelligence - Volume 1: ICAART, ISBN 978-989-674-021-4, pages 652-655. DOI: 10.5220/0002762106520655

in Bibtex Style

author={Laroussi Merhbene and Anis Zouaghi and Mounir Zrigui},
booktitle={Proceedings of the 2nd International Conference on Agents and Artificial Intelligence - Volume 1: ICAART},

in EndNote Style

JO - Proceedings of the 2nd International Conference on Agents and Artificial Intelligence - Volume 1: ICAART
SN - 978-989-674-021-4
AU - Merhbene L.
AU - Zouaghi A.
AU - Zrigui M.
PY - 2010
SP - 652
EP - 655
DO - 10.5220/0002762106520655