Authors:
Ana Paula Silva (1); Arlindo Silva (1) and Irene Rodrigues (2)
Affiliations:
(1) Instituto Politécnico de Castelo Branco, Portugal; (2) Universidade de Évora, Portugal
Keyword(s):
Part-of-Speech Tagging, Disambiguation Rules, Evolutionary Algorithms, Genetic Algorithms, Natural Language Processing.
Related Ontology Subjects/Areas/Topics:
Artificial Intelligence; Computational Intelligence; Evolutionary Computing; Genetic Algorithms; Informatics in Control, Automation and Robotics; Intelligent Control Systems and Optimization; Soft Computing
Abstract:
In this paper we present an evolutionary approach to the part-of-speech tagging problem. The goal of part-of-speech tagging is to assign each word of a text its part of speech. The task is not straightforward, because a large percentage of words have more than one possible part of speech, and the right choice is determined by the parts of speech of the surrounding words. Solving the problem therefore requires a method for disambiguating a word's set of possible tags. Traditionally, two groups of methods have been used to tackle this task. The first is based on statistical data about the possible contexts of a word, while the second is based on rules, normally designed by human experts, that capture the properties of the language. In this work we present a solution that tries to incorporate both approaches. The proposed system is divided into two components. First, we use an evolutionary algorithm that evolves a set of disambiguation rules for each part-of-speech tag of the training corpus. We then use a second evolutionary algorithm, guided by the rules found earlier, to solve the tagging problem. The results obtained on two different corpora are among the best published for those corpora.
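As a rough illustration of the second component only, the following Python sketch evolves a tag sequence for one sentence with a small genetic algorithm. It is not the authors' implementation: the lexicon, the context scores standing in for the evolved disambiguation rules, and all function names and parameters are hypothetical, chosen only to show how a rule-based score could guide the search over a word's candidate tags.

```python
import random

# Hypothetical toy lexicon: each word maps to its candidate tags.
LEXICON = {
    "the": ["DET"],
    "can": ["NOUN", "VERB", "AUX"],
    "rusts": ["VERB", "NOUN"],
}

# Hypothetical scores standing in for evolved disambiguation rules:
# (previous tag, current tag) -> how strongly the context supports the tag.
RULE_SCORES = {
    ("<s>", "DET"): 1.0,
    ("DET", "NOUN"): 1.0,
    ("NOUN", "VERB"): 1.0,
}


def fitness(tags):
    """Sum the rule scores over consecutive tag pairs in the sequence."""
    prev, total = "<s>", 0.0
    for tag in tags:
        total += RULE_SCORES.get((prev, tag), 0.0)
        prev = tag
    return total


def random_tagging(words, rng):
    """Pick one candidate tag per word at random."""
    return [rng.choice(LEXICON[w]) for w in words]


def crossover(a, b, rng):
    """One-point crossover between two tag sequences of equal length."""
    if len(a) < 2:
        return list(a)
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def mutate(words, tags, rng, rate=0.2):
    """Resample each tag from the word's candidates with probability `rate`."""
    return [rng.choice(LEXICON[w]) if rng.random() < rate else t
            for w, t in zip(words, tags)]


def evolve_tagging(words, generations=30, pop_size=20, seed=0):
    """Evolve a tagging; the better half of each generation survives."""
    rng = random.Random(seed)
    population = [random_tagging(words, rng) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            children.append(mutate(words, crossover(a, b, rng), rng))
        population = parents + children
    return max(population, key=fitness)


if __name__ == "__main__":
    sentence = ["the", "can", "rusts"]
    print(list(zip(sentence, evolve_tagging(sentence))))
```

Each individual is a full tagging of the sentence, and mutation only ever draws from a word's candidate tags, so every individual remains a valid tagging. In the system described above, the scoring would come from the rules evolved by the first component rather than from a hand-written table.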