A Divergence from Randomness Framework of WordNet Synsets’ Distribution for Word Sense Disambiguation

Kostas Fragos, Christos Skourlas

Abstract

We describe and experimentally evaluate a method for word sense disambiguation that measures how far the distribution of WordNet synsets in the context of the word to be disambiguated (the target word) diverges from randomness. First, for each word appearing in the context, we collect its related synsets from WordNet using WordNet relations, thereby creating a bag of related synsets for the context. Second, for each sense of the target word, we study the distribution of its related synsets in the context bag. By assigning a theoretical random process to these distributions and measuring the divergence from that process, we infer the correct sense of the target word. The method was evaluated on the English lexical sample data from the Senseval-2 word sense disambiguation competition, where it performed better than most known measures for word sense disambiguation based on WordNet relations. Moreover, the method is general: the disambiguation task can be carried out with any random process assigned to the distribution of the related synsets and with any measure used to quantify the divergence from randomness.
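The two steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract leaves both the random process and the divergence measure open, so the sketch assumes a binomial model for how often a sense's related synsets would appear in the context bag by chance, and scores each sense by the negative base-2 logarithm of the upper-tail probability of the observed count. The synset labels, the `CONTEXT_SYNSETS` and `SENSE_SYNSETS` tables, and `UNIVERSE_SIZE` are hypothetical toy values standing in for real WordNet lookups.

```python
import math

# Hypothetical toy data standing in for WordNet lookups (not real WordNet output).
# Related synsets collected for each context word via WordNet relations.
CONTEXT_SYNSETS = {
    "river": ["river", "stream", "water"],
    "water": ["water", "liquid"],
    "fishing": ["fishing", "sport"],
}
# Related synsets of each candidate sense of the target word "bank".
SENSE_SYNSETS = {
    "bank#river": {"slope", "river", "water", "stream"},
    "bank#finance": {"money", "deposit", "loan", "institution"},
}
# Assumed total number of distinct synsets a bag token could be drawn from.
UNIVERSE_SIZE = 1000


def binomial_upper_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p), clamped to (0, 1]."""
    tail = sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))
    return max(min(1.0, tail), 1e-300)


def divergence_scores(context_words, context_synsets, sense_synsets, universe_size):
    """Score each sense by how unlikely its observed overlap is under chance."""
    # Step 1: build the bag of related synsets for the whole context.
    bag = [s for w in context_words for s in context_synsets.get(w, [])]
    n = len(bag)
    scores = {}
    for sense, related in sense_synsets.items():
        # Step 2: count bag tokens that belong to this sense's related synsets.
        k = sum(1 for s in bag if s in related)
        # Assumed random process: each bag token hits the sense's related
        # set independently with probability |related| / universe_size.
        p = len(related) / universe_size
        # Assumed divergence measure: -log2 of the chance of seeing >= k hits.
        scores[sense] = -math.log2(binomial_upper_tail(k, n, p))
    return scores


scores = divergence_scores(
    ["river", "water", "fishing"], CONTEXT_SYNSETS, SENSE_SYNSETS, UNIVERSE_SIZE
)
# The sense whose related synsets diverge most from chance is selected.
best = max(scores, key=scores.get)
print(best, scores)
```

In this toy context the river sense is selected, because four of the seven bag tokens fall in its related set while none fall in the related set of the financial sense. Any other random process or divergence measure could be substituted inside `divergence_scores` without changing the overall procedure.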



Paper Citation


in Harvard Style

Fragos K. and Skourlas C. (2006). A Divergence from Randomness Framework of WordNet Synsets’ Distribution for Word Sense Disambiguation. In Proceedings of the 3rd International Workshop on Natural Language Understanding and Cognitive Science - Volume 1: NLUCS, (ICEIS 2006) ISBN 978-972-8865-50-4, pages 71-80. DOI: 10.5220/0002499700710080


in Bibtex Style

@conference{nlucs06,
author={Kostas Fragos and Christos Skourlas},
title={A Divergence from Randomness Framework of WordNet Synsets’ Distribution for Word Sense Disambiguation},
booktitle={Proceedings of the 3rd International Workshop on Natural Language Understanding and Cognitive Science - Volume 1: NLUCS, (ICEIS 2006)},
year={2006},
pages={71-80},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0002499700710080},
isbn={978-972-8865-50-4},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 3rd International Workshop on Natural Language Understanding and Cognitive Science - Volume 1: NLUCS, (ICEIS 2006)
TI - A Divergence from Randomness Framework of WordNet Synsets’ Distribution for Word Sense Disambiguation
SN - 978-972-8865-50-4
AU - Fragos K.
AU - Skourlas C.
PY - 2006
SP - 71
EP - 80
DO - 10.5220/0002499700710080