Linguistically-Motivated Automatic Morphological Analysis for Wordnet Enrichment

Tom Richens

Abstract

The performance of NLP systems can only be as good as the lexical resources they employ. Modelling the evolved structure of language offers scope for morpho-semantic enrichment of these resources. A set of linguistically-informed morphological rules is formulated from the CatVar database, implemented in a Java model of WordNet and tested on suffixation and desuffixation. Overgeneration and undergeneration are measured, and an approach to reducing both by using multilingual resources is proposed.

References

  1. Bilgin, O., Çetinoglu, Ö. & Oflazer, K. (2004). Morphosemantic Relations In and Across Wordnets, Proceedings of the Second International WordNet Conference, Brno, Czech Republic, January 20-23, 2004, 60-66.
  2. Bosch, S., Fellbaum, C. & Pala, K. (2008). Enhancing WordNets with Morphological Relations: A Case Study from Czech, English and Zulu, Proceedings of the Fourth Global WordNet Conference, Szeged, Hungary, January 22-25, 2008, 74-90.
  3. COED (1971-80). The Compact Edition of the Oxford English Dictionary, Complete Text Reproduced Micrographically, Oxford University Press.
  4. Fellbaum, C. (Ed.) (1998). WordNet: An Electronic Lexical Database, Cambridge, MA, MIT Press.
  5. Goldsmith, J. (2001). Unsupervised Learning of the Morphology of a Natural Language, Computational Linguistics, 27, 153-198.
  6. Habash, N. & Dorr, B. (2003). A Categorial Variation Database for English, Proceedings of the North American Association for Computational Linguistics, Edmonton, Canada, 96-102.
  7. Hathout, N. (2008). Acquisition of the morphological structure of the lexicon based on lexical similarity and formal analogy, Proceedings of the 3rd Textgraphs workshop on Graph-based Algorithms for Natural Language Processing, 22nd International Conference on Computational Linguistics, Manchester, 24 August, 2008.
  8. Koeva, S., Krstev, C. & Vitas, D. (2008). Morpho-semantic Relations in WordNet: A Case Study for two Slavic Languages, Proceedings of the Fourth Global WordNet Conference, Szeged, Hungary, January 22-25, 2008, 239-253.
  9. Mbame, N. (2008). Towards a Morphodynamic WordNet of the Lexical Meaning, Proceedings of the Fourth Global WordNet Conference, Szeged, Hungary, January 22-25, 2008, 304-310.
  10. Onions, C. T. (Ed.) (1966). The Oxford Dictionary of English Etymology, Oxford, Clarendon Press.
  11. Porter, M. F. (1980). An algorithm for suffix stripping, Program, 14(3), 130-137.
  12. Simpson, D. P. (1966). Cassell's New Latin Dictionary, London, Cassell, 4th Edition.
  13. Wong, S. H. S. (2004). Fighting Arbitrariness in WordNet-like Lexical Databases. A Natural Language Motivated Remedy, Proceedings of the Second International WordNet Conference, Brno, Czech Republic, January 20-23, 2004, 234-241.


Paper Citation


in Harvard Style

Richens T. (2009). Linguistically-Motivated Automatic Morphological Analysis for Wordnet Enrichment. In Proceedings of the 6th International Workshop on Natural Language Processing and Cognitive Science - Volume 1: NLPCS, (ICEIS 2009) ISBN 978-989-8111-92-0, pages 36-45. DOI: 10.5220/0002171900360045


in Bibtex Style

@conference{nlpcs09,
author={Tom Richens},
title={Linguistically-Motivated Automatic Morphological Analysis for Wordnet Enrichment},
booktitle={Proceedings of the 6th International Workshop on Natural Language Processing and Cognitive Science - Volume 1: NLPCS, (ICEIS 2009)},
year={2009},
pages={36-45},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0002171900360045},
isbn={978-989-8111-92-0},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 6th International Workshop on Natural Language Processing and Cognitive Science - Volume 1: NLPCS, (ICEIS 2009)
TI - Linguistically-Motivated Automatic Morphological Analysis for Wordnet Enrichment
SN - 978-989-8111-92-0
AU - Richens T.
PY - 2009
SP - 36
EP - 45
DO - 10.5220/0002171900360045