Author:
Tom Richens
Affiliation:
Aston University, United Kingdom
Keyword(s):
lexical resource, morpho-semantic, WordNet, enrichment, CatVar, suffixation, desuffixation, overgeneration, undergeneration, multilingual, Latin, derivational morphology, etymology, morphological rule, participle
Abstract:
The performance of NLP systems can only be as good as the lexical resources they employ. Modelling the evolved structure of language offers scope for morpho-semantic enrichment of these resources. A set of linguistically informed morphological rules is formulated from the CatVar database, implemented in a Java model of WordNet and tested on suffixation and desuffixation. Overgeneration and undergeneration are measured, and an approach to reducing both by using multilingual resources is proposed.