times from a language whose spelling was not yet standardised. 40.58% of
undergenerations in desuffixation involve other languages.
6 Conclusions and Proposed Future Research
A linguistically motivated and multilingually aware approach to discovering
morpho-semantic relations has been demonstrated. It outperforms the CatVar database
and can be applied directly to any lexicon without other resources.
There is plenty of scope for enriching WordNet with data relating to derivational
morphology. The Java model of WordNet is a firm foundation for implementing and
demonstrating this enrichment. A set of new types of relation has been proposed to
capture the semantics. Further research will verify their applicability.
Some morphological rules are unreliable as implemented, and need more rigorous
formulations. Implementation of appropriate word length thresholds would allow the
automatic processing of regular longer words while shorter words are checked
manually. Further rules could be formulated by examining suffixes in the lexicon
without CatVar.
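The length threshold suggested above could be sketched as follows. This is only an illustration under stated assumptions: the function name `route_by_length` and the cut-off value of 6 characters are hypothetical and do not come from the implementation described in this paper. An appropriate threshold would have to be determined empirically.

```python
from typing import Iterable, List, Tuple

# Hypothetical cut-off; the paper does not state a value.
MIN_AUTO_LENGTH = 6


def route_by_length(
    candidates: Iterable[str], min_auto_length: int = MIN_AUTO_LENGTH
) -> Tuple[List[str], List[str]]:
    """Split candidate words into (automatic, manual) lists by length.

    Words at or above the threshold are passed to automatic rule
    application; shorter words are set aside for manual checking.
    """
    automatic: List[str] = []
    manual: List[str] = []
    for word in candidates:
        if len(word) >= min_auto_length:
            automatic.append(word)
        else:
            manual.append(word)
    return automatic, manual


automatic, manual = route_by_length(["nationalise", "agent", "reader", "ion"])
```

Here the longer words are routed to automatic processing, while the shorter ones, where a spurious suffix match is more likely, are queued for manual review.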
Some of the most important morphological rules have not been implemented, for
lack of multilingual resources. Others have been implemented monolingually,
accounting for much overgeneration. The most important cause of undergeneration is
non-implementation of multilingual rules, especially with reference to Latin
participles. Implementing these rules is the most important single enhancement that
could be made. This will be a significant area of further research, leading to a fully
enriched morpho-semantic database.
References
1. Bilgin, O., Çetinoğlu, Ö. & Oflazer, K. (2004). Morphosemantic Relations In and Across
Wordnets, Proceedings of the Second International WordNet Conference, Brno, Czech
Republic, January 20-23, 2004, 60-66.
2. Bosch, S., Fellbaum, C. & Pala, K. (2008). Enhancing WordNets with Morphological
Relations: A Case Study from Czech, English and Zulu, Proceedings of the Fourth Global
WordNet Conference, Szeged, Hungary, January 22-25, 2008, 74-90.
3. COED (1971-80). The Compact Edition of the Oxford English Dictionary, Complete Text
Reproduced Micrographically, Oxford University Press.
4. Fellbaum, C. (ed.) (1998). WordNet: An Electronic Lexical Database, Cambridge, MA.,
MIT Press.
5. Goldsmith, J. (2001). Unsupervised Learning of the Morphology of a Natural Language,
Computational Linguistics, 27, 153-198.
6. Habash, N. & Dorr, B. (2003). A Categorial Variation Database for English, Proceedings of
the North American Association for Computational Linguistics, Edmonton, Canada,
96-102.
7. Hathout, N. (2008). Acquisition of the morphological structure of the lexicon based on
lexical similarity and formal analogy, Proceedings of the 3rd Textgraphs workshop on
Graph-based Algorithms for Natural Language Processing, 22nd International Conference
on Computational Linguistics, Manchester, 24 August, 2008.