Authors:
Roman Sergienko (1); Iuliia Kamshilova (2); Eugene Semenkin (2) and Alexander Schmitt (1)
Affiliations:
(1) Ulm University, Germany; (2) Siberian State Aerospace University, Russian Federation
Keyword(s):
Text Classification, Term Weighting, Weighted Voting, Self-adjusting Genetic Algorithm.
Related Ontology Subjects/Areas/Topics:
Artificial Intelligence; Computational Intelligence; Engineering Applications; Evolutionary Computing; Genetic Algorithms; Human-Machine Interfaces; Industrial Engineering; Informatics in Control, Automation and Robotics; Intelligent Control Systems and Optimization; Optimization Algorithms; Performance Evaluation and Optimization; Robotics and Automation; Signal Processing, Sensors, Systems Modeling and Control; Soft Computing
Abstract:
This paper considers the text classification problem for natural language call routing. Seven different term weighting methods were applied. Two dimensionality reduction approaches were considered: the combination of stop-word filtering and stemming, and a feature transformation based on term belonging to classes. kNN and SVM-FML were used as classification algorithms. The paper proposes voting across different term weighting methods. The majority vote of the seven considered term weighting methods provides a significant improvement in classification effectiveness. Weighted voting based on optimization with a self-adjusting genetic algorithm was then investigated. The numerical results showed that weighted voting provides an additional improvement in classification effectiveness. The improvement is especially significant with the feature transformation based on term belonging to classes, which reduces the dimensionality radically: the dimensionality equals the number of classes. This approach can therefore be useful for real-time systems such as natural language call routing.
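
The voting idea in the abstract can be illustrated with a minimal Python sketch. It assumes that each of the seven term weighting methods has already produced one class label for the same utterance. The function name `weighted_vote`, the example labels, the example weights, and the tie-breaking rule are all hypothetical; in the paper the weights are obtained with a self-adjusting genetic algorithm, which is not reproduced here.

```python
from collections import defaultdict


def weighted_vote(predictions, weights=None):
    """Combine class labels predicted with different term weighting methods.

    predictions: list of labels, one per term weighting method.
    weights: optional list of non-negative numbers, one per prediction.
             If omitted, every vote counts as 1 (plain majority vote).
    Ties are broken in favour of the label that appears first in
    `predictions` (an assumption made for this sketch).
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    if weights is None:
        weights = [1.0] * len(predictions)
    if len(weights) != len(predictions):
        raise ValueError("weights and predictions must have equal length")

    # Accumulate the total weight received by each class label.
    scores = defaultdict(float)
    for label, weight in zip(predictions, weights):
        scores[label] += weight

    # Return the first label (in input order) that reaches the top score.
    best = max(scores.values())
    for label in predictions:
        if scores[label] == best:
            return label


# Hypothetical labels from seven term weighting methods for one call.
labels = ["billing", "support", "billing", "sales",
          "support", "support", "billing"]

majority = weighted_vote(labels)

# Illustrative hand-picked weights; a real system would optimize them.
example_weights = [0.9, 0.2, 0.8, 0.1, 0.2, 0.2, 0.7]
weighted = weighted_vote(labels, example_weights)
```

With equal weights the call reduces to a majority vote; passing a weight vector lets more reliable term weighting methods dominate the decision.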