Authors:
Artur Ferreira (1, 2) and Mário Figueiredo (1, 3)
Affiliations:
(1) Instituto de Telecomunicações, Lisboa, Portugal
(2) ISEL, Instituto Superior de Engenharia de Lisboa, Instituto Politecnico de Lisboa, Portugal
(3) IST, Instituto Superior Tecnico, Universidade de Lisboa, Portugal
Keyword(s):
Cancer Detection, Classification, Feature Selection, Filter, Gene Expression, Microarray Data, Union.
Abstract:
Cancer detection from microarray data is an important problem for machine learning techniques. This type of data poses many challenges, chiefly because it usually has a large number of features (genes) and a small number of instances (patients). Moreover, it is important to identify which genes are the most relevant for a given classification task, which provides explainability for the classification. In this paper, we propose a feature selection approach for microarray data that extends the recently proposed k-fold feature selection algorithm. We propose taking the union of the feature subspaces found independently by two feature selection filters, each of which has individually been shown to be adequate for this type of data. The experimental results show that, in some cases, the union of the feature subsets found by the two filters produces better results than either filter alone, while yielding human-manageable subsets of features.
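The idea of taking the union of two filters' selections can be illustrated with a minimal sketch. The abstract does not name the two filters or describe the k-fold feature selection algorithm, so everything below is an assumption made for illustration only: the variance ranking and the two-class Fisher ratio are stand-in filters, the function names and the per-filter budget `m` are hypothetical, and the data are synthetic.

```python
import numpy as np

def top_m_by_variance(X, m):
    """Stand-in filter A (assumption): indices of the m highest-variance features."""
    scores = X.var(axis=0)
    return set(np.argsort(scores)[::-1][:m].tolist())

def top_m_by_fisher_ratio(X, y, m):
    """Stand-in filter B (assumption): indices of the m features with the
    largest two-class Fisher ratio (squared mean difference over summed variances)."""
    X0, X1 = X[y == 0], X[y == 1]
    num = (X0.mean(axis=0) - X1.mean(axis=0)) ** 2
    den = X0.var(axis=0) + X1.var(axis=0) + 1e-12  # avoid division by zero
    return set(np.argsort(num / den)[::-1][:m].tolist())

def union_of_filters(X, y, m):
    """Run both filters independently and return the sorted union of their picks."""
    subset_a = top_m_by_variance(X, m)
    subset_b = top_m_by_fisher_ratio(X, y, m)
    return sorted(subset_a | subset_b)

# Synthetic microarray-like data: few instances (patients), many features (genes).
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 500))
y = np.array([0] * 20 + [1] * 20)
X[y == 1, :5] += 3.0  # make the first five "genes" informative

selected = union_of_filters(X, y, m=10)
X_reduced = X[:, selected]  # reduced data passed on to any classifier
```

With a budget of `m` features per filter, the union holds between `m` and `2 * m` features, so the resulting subset stays small enough to inspect by hand while keeping any gene that either filter considers relevant.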