Word Sense Disambiguation of Persian Homographs
F. Jani, A.H. Pilevar
2012
Abstract
This paper addresses the disambiguation of Persian homographs, that is, words with the same written form but different senses, using a combination of supervised and unsupervised methods that rely on a thesaurus and a corpus. The method builds on a previously proposed one, with several differences. It uses texts that have been collected by either supervised or unsupervised means. In addition, the words of the input corpus are stemmed, and for words whose different senses play different roles in the sentence, the role of the word in the input sentence is taken into account during disambiguation. Applied to selected ambiguous words from “Hamshahri”, a standard Persian corpus, the method achieved an accuracy of 97 percent, and we judge it to be better and more efficient than similar methods.
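As a rough illustration of the ingredients named in the abstract (sense-related word lists, stemming of the input, and a filter on the word's role in the sentence), the following minimal Python sketch scores each sense by how many stemmed context words overlap with its word list. It is not the authors' implementation: the names `SENSE_PROFILES`, `toy_stem`, and `disambiguate`, the English toy data, and the overlap-count scoring are all assumptions made for illustration. A real system would use a Persian stemmer and word lists derived from a thesaurus and corpus.

```python
from collections import Counter

# Hypothetical toy data: each sense has a part of speech ("role") and a set of
# related words standing in for lists derived from a thesaurus and a corpus.
SENSE_PROFILES = {
    "bank/finance": {"pos": "NOUN", "words": {"money", "loan", "account"}},
    "bank/river": {"pos": "NOUN", "words": {"river", "water", "shore"}},
}


def toy_stem(token):
    """Crude plural stripping; a placeholder for a real Persian stemmer."""
    if token.endswith("s") and len(token) > 3:
        return token[:-1]
    return token


def disambiguate(context_tokens, profiles, observed_pos=None):
    """Return (best_sense, scores) for the homograph given its context.

    If observed_pos is given, only senses with that role are considered;
    if no sense matches, all senses are scored.
    """
    stems = Counter(toy_stem(t.lower()) for t in context_tokens)
    candidates = {
        sense: p
        for sense, p in profiles.items()
        if observed_pos is None or p["pos"] == observed_pos
    }
    if not candidates:
        candidates = profiles
    # Score = number of context tokens whose stem appears in the sense's list.
    scores = {
        sense: sum(stems[w] for w in p["words"])
        for sense, p in candidates.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    context = "She opened an account and took loans".split()
    print(disambiguate(context, SENSE_PROFILES))
```

When the senses of a homograph differ in role, passing `observed_pos` narrows the candidates before any scoring happens, which mirrors the abstract's remark that the role of the word in the sentence is considered.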
References
- Fararooy, J., Thesaurus and Electronic transfer of Persian language content, In: 2nd workshop on Persian language and computer, Tehran, Iran, 2004.
- Fararooy, J., Thesaurus of Persian Words and Phrases, 1999.
- Gaustad, T., Linguistic Knowledge and Word Sense Disambiguation, PhD dissertation, Groningen University, 2004.
- Ide, N., Veronis, J., Introduction to the Special Issue on Word Sense Disambiguation: The State of the Art, Computational Linguistics 24(1), 1-40, 1998.
- Makki, R., Homayounpour, M., Word Sense Disambiguation of Farsi homographs Using Thesaurus and Corpus, Amirkabir University of Technology, Tehran, Iran, 2008.
Paper Citation
in Harvard Style
Jani F. and Pilevar A. (2012). Word Sense Disambiguation of Persian Homographs. In Proceedings of the 7th International Conference on Software Paradigm Trends - Volume 1: ICSOFT, ISBN 978-989-8565-19-8, pages 328-331. DOI: 10.5220/0004031703280331
in Bibtex Style
@conference{icsoft12,
author={F. Jani and A.H. Pilevar},
title={Word Sense Disambiguation of Persian Homographs},
booktitle={Proceedings of the 7th International Conference on Software Paradigm Trends - Volume 1: ICSOFT},
year={2012},
pages={328-331},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004031703280331},
isbn={978-989-8565-19-8},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 7th International Conference on Software Paradigm Trends - Volume 1: ICSOFT
TI - Word Sense Disambiguation of Persian Homographs
SN - 978-989-8565-19-8
AU - Jani F.
AU - Pilevar A.
PY - 2012
SP - 328
EP - 331
DO - 10.5220/0004031703280331