Authors:
José Manuel Cadenas; María del Carmen Garrido and Raquel Martínez
Affiliation:
University of Murcia, Spain
Keyword(s):
Feature Selection, Low Quality Data, Fuzzy Random Forest, Fuzzy Decision Tree.
Related Ontology Subjects/Areas/Topics:
Approximate Reasoning and Fuzzy Inference; Artificial Intelligence; Computational Intelligence; Fuzzy Systems; Pattern Recognition: Fuzzy Clustering and Classifiers; Soft Computing
Abstract:
Feature selection is an active research area in machine learning. Its main idea is to choose a subset of the available features by eliminating features with little or no predictive information, as well as strongly correlated features. There are many approaches to feature selection, but most of them can only work with crisp data. To our knowledge, few approaches can work directly with both crisp and low quality (imprecise and uncertain) data. We therefore propose a new feature selection method that can handle both crisp and low quality data. The proposed approach integrates filter and wrapper methods into a sequential search procedure that improves the classification accuracy of the selected features. The approach consists of the following steps: (1) scaling and discretization of the feature set, and feature pre-selection using the discretization process (filter); (2) ranking of the pre-selected features using a Fuzzy Random Forest ensemble; (3) wrapper feature selection using a Fuzzy Decision Tree technique based on cross-validation. The efficiency and effectiveness of the approach are demonstrated through several experiments with low quality datasets. The approach performs very well, not only in classification accuracy but also with respect to the number of features selected.
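The three steps can be sketched as a filter, rank, wrapper pipeline. The following is a minimal illustration on crisp data only, not the authors' implementation: equal-width binning stands in for the discretization process, information gain stands in for the Fuzzy Random Forest ranking, and a cross-validated nearest-centroid classifier stands in for the Fuzzy Decision Tree wrapper. All function names, the `min_gain` threshold, and the toy data are assumptions made for this example.

```python
import math
from collections import Counter

def min_max_scale(column):
    """Scale one feature column to [0, 1]; constant columns map to 0."""
    lo, hi = min(column), max(column)
    return [0.0 if hi == lo else (v - lo) / (hi - lo) for v in column]

def discretize(scaled, bins=3):
    """Equal-width binning of a scaled column (stand-in discretization)."""
    return [min(int(v * bins), bins - 1) for v in scaled]

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def info_gain(binned, labels):
    """Information gain of a discretized feature with respect to the class."""
    groups = {}
    for b, y in zip(binned, labels):
        groups.setdefault(b, []).append(y)
    n = len(labels)
    return entropy(labels) - sum(len(g) / n * entropy(g) for g in groups.values())

def cv_accuracy(X, y, feats, folds=3):
    """k-fold CV accuracy of a nearest-centroid classifier (stand-in wrapper model)."""
    correct = 0
    for k in range(folds):
        test_idx = set(range(k, len(X), folds))
        sums, counts = {}, Counter()
        for i, row in enumerate(X):
            if i in test_idx:
                continue
            acc = sums.setdefault(y[i], [0.0] * len(feats))
            for j, f in enumerate(feats):
                acc[j] += row[f]
            counts[y[i]] += 1
        cents = {c: [s / counts[c] for s in v] for c, v in sums.items()}
        for i in test_idx:
            pred = min(cents, key=lambda c: sum(
                (X[i][f] - m) ** 2 for f, m in zip(feats, cents[c])))
            correct += pred == y[i]
    return correct / len(X)

def select_features(X, y, min_gain=0.01):
    n_feats = len(X[0])
    scaled_cols = [min_max_scale([row[f] for row in X]) for f in range(n_feats)]
    Xs = [[scaled_cols[f][i] for f in range(n_feats)] for i in range(len(X))]
    # (1) Filter: drop features whose discretized form carries almost no class information.
    gains = {f: info_gain(discretize(scaled_cols[f]), y) for f in range(n_feats)}
    kept = [f for f in range(n_feats) if gains[f] > min_gain]
    # (2) Ranking: order the pre-selected features by importance.
    ranked = sorted(kept, key=lambda f: gains[f], reverse=True)
    # (3) Wrapper: add features in rank order while CV accuracy improves.
    selected, best = [], 0.0
    for f in ranked:
        score = cv_accuracy(Xs, y, selected + [f])
        if score > best:
            selected, best = selected + [f], score
    return selected, best

# Toy data: feature 0 separates the classes, feature 1 is constant,
# feature 2 is unrelated to the class.
X = [[0.0, 5.0, 1.0], [1.0, 5.0, 2.0], [0.5, 5.0, 3.0],
     [9.0, 5.0, 1.0], [10.0, 5.0, 2.0], [9.5, 5.0, 3.0]]
y = [0, 0, 0, 1, 1, 1]
selected, best = select_features(X, y)
print(selected, best)
```

Handling low quality data, which is the focus of the proposed method, would require replacing each stand-in with a component that accepts imprecise or uncertain values (for example, interval or fuzzy inputs); this sketch does not attempt that.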