Authors:
L. Hedjazi (1); M.-V. Le Lann (1); T. Kempowsky-Hamon (1); F. Dalenc (2) and G. Favre (2)
Affiliations:
(1) CNRS and LAAS, France; (2) INSERM U563 and Institut Claudius Regaud, France
Keyword(s):
Feature selection, Fuzzy logic, Mixed-Type Data, Breast cancer prognosis.
Related Ontology Subjects/Areas/Topics:
Algorithms and Software Tools; Bioinformatics; Biomedical Engineering; Data Mining and Machine Learning; Databases and Data Management; Pattern Recognition, Clustering and Classification
Abstract:
Clinical factors, such as patient age and histopathological state, remain the basis of day-to-day decisions in cancer management. However, with the advent of high-throughput technology, gene expression profiling and proteomic sequences have recently come into widespread use for the management of cancer and other diseases. In this work, we aim to assess the value of using both types of data to improve breast cancer prognosis. Integrating the two types of information raises two challenges: the high dimensionality and the heterogeneity of the data. The first arises from the large number of irrelevant genes in microarray data, whereas the second stems from the presence of mixed-type data (quantitative, qualitative and interval) in the clinical data. In this paper, an efficient fuzzy feature selection algorithm is used to address both challenges simultaneously. The results obtained demonstrate the effectiveness of the proposed approach.