Authors:
Catarina Silva 1; Bernardete Ribeiro 1; Uroš Lotrič 2 and Andrej Dobnikar 2
Affiliations:
1 University of Coimbra, CISUC, Portugal; 2 University of Ljubljana, Faculty of Computer and Information Science, Slovenia
Keyword(s):
Text Mining, Cluster Computing.
Related Ontology Subjects/Areas/Topics:
Artificial Intelligence; Biomedical Engineering; Business Analytics; Data Engineering; Data Mining; Databases and Information Systems Integration; Datamining; Enterprise Information Systems; Health Information Systems; Sensor Networks; Signal Processing; Soft Computing
Abstract:
In today’s society, individuals and organizations face an ever-growing volume and diversity of textual information and content, along with increasing demands for knowledge and skills. In this work we address part of these challenges by tackling text classification problems, which are essential to managing knowledge, through a combination of several pioneering kernel-based learning machines, namely Support Vector Machines and Relevance Vector Machines. To speed up the complex learning procedures, we establish a model of a high-performance distributed computing environment that helps tackle the tasks involved in the text classification problem.
The presented approach is valuable in many practical situations where text classification is used. The Reuters-21578 benchmark data set is used to demonstrate the strength of the proposed system, while different ensemble-based learning machines provide text classification models that are efficiently deployed on the Condor and Alchemi platforms.