COMPUTATION OF THE SEMANTIC RELATEDNESS BETWEEN WORDS USING CONCEPT CLOUDS

Swarnim Kulkarni, Doina Caragea

2009

Abstract

Determining the semantic relatedness between two words means computing a statistical measure of the similarity between them. Word similarity measures are useful in a wide range of applications, such as natural language processing, query recommendation, relation extraction, spelling correction, document comparison, and other information retrieval tasks. Although several methods have been proposed to address this problem, computing semantic relatedness effectively remains a challenging task. In this paper, we propose a new technique for computing the relatedness between two words. Instead of computing the relatedness between the two words directly, we first compute the relatedness between their generated concept clouds using web-based coefficients. We then use the resulting measure to determine the relatedness between the original words. Our approach relies heavily on a concept extraction algorithm that extracts concepts related to a given query and generates a concept cloud for the query concept. We evaluate the approach on the Miller-Charles benchmark dataset and obtain a correlation coefficient of 0.882, which is better than the correlation coefficients of all other existing state-of-the-art methods, providing evidence for the effectiveness of our method.
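
The two-step idea in the abstract (score the concept clouds, then use that score for the original words) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration, not the method of the paper: the abstract does not say which web-based coefficient is used or how pairwise scores are aggregated, so the sketch uses a Jaccard-style coefficient over search-engine page counts and a plain mean over all cross-cloud pairs. The function names, the `hits` and `pair_hits` dictionaries, and the toy page counts are hypothetical.

```python
from itertools import product


def web_jaccard(h_p, h_q, h_pq):
    """Jaccard-style coefficient from page counts: hits(P AND Q) / hits(P OR Q)."""
    union = h_p + h_q - h_pq
    return h_pq / union if union > 0 else 0.0


def cloud_relatedness(cloud_a, cloud_b, hits, pair_hits):
    """Mean pairwise coefficient between two concept clouds.

    ASSUMPTION: averaging over all cross-cloud pairs is an illustrative
    choice; the abstract does not specify the aggregation.
    """
    scores = []
    for a, b in product(cloud_a, cloud_b):
        if a == b:
            # A concept shared by both clouds is treated as fully related.
            scores.append(1.0)
            continue
        h_ab = pair_hits.get(frozenset((a, b)), 0)
        scores.append(web_jaccard(hits[a], hits[b], h_ab))
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical page counts; a real system would query a search engine.
HITS = {"vehicle": 1000, "engine": 800, "wheel": 500}
PAIR_HITS = {
    frozenset(("vehicle", "engine")): 400,
    frozenset(("vehicle", "wheel")): 300,
    frozenset(("engine", "wheel")): 200,
}

# Hypothetical concept clouds for the word pair (car, automobile).
car_cloud = ["vehicle", "engine"]
automobile_cloud = ["vehicle", "wheel"]

score = cloud_relatedness(car_cloud, automobile_cloud, HITS, PAIR_HITS)
print(f"relatedness(car, automobile) ~ {score:.3f}")
```

In this sketch the cloud-level score is used directly as the word-level relatedness. Scores computed this way for each benchmark word pair could then be correlated with the human ratings, which is the kind of evaluation the abstract reports for the Miller-Charles dataset.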


Paper Citation


in Harvard Style

Kulkarni S. and Caragea D. (2009). COMPUTATION OF THE SEMANTIC RELATEDNESS BETWEEN WORDS USING CONCEPT CLOUDS. In Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2009) ISBN 978-989-674-011-5, pages 183-188. DOI: 10.5220/0002303701830188


in BibTeX Style

@conference{kdir09,
author={Swarnim Kulkarni and Doina Caragea},
title={COMPUTATION OF THE SEMANTIC RELATEDNESS BETWEEN WORDS USING CONCEPT CLOUDS},
booktitle={Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2009)},
year={2009},
pages={183-188},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0002303701830188},
isbn={978-989-674-011-5},
}


in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2009)
TI - COMPUTATION OF THE SEMANTIC RELATEDNESS BETWEEN WORDS USING CONCEPT CLOUDS
SN - 978-989-674-011-5
AU - Kulkarni S.
AU - Caragea D.
PY - 2009
SP - 183
EP - 188
DO - 10.5220/0002303701830188