Authors:
Anna Karen Garate Escamilla 1; Amir Hajjam El Hassani 1 and Emmanuel Andres 2
Affiliations:
1 Nanomedicine Lab, Univ. Bourgogne Franche-Comte, UTBM, F-90010 Belfort, France
2 Service de Médecine Interne, Diabète et Maladies métaboliques de la Clinique Médicale B, CHRU de Strasbourg, Strasbourg, France; Centre de Recherche Pédagogique en Sciences de la Santé, Faculté de Médecine de Strasbourg, Université de Strasbourg (UdS), Strasbourg, France
Keyword(s):
Machine Learning, Heart Failure, Apache Spark, Feature Selection, PCA.
Related Ontology Subjects/Areas/Topics:
Applications; Bioinformatics and Systems Biology; Feature Selection and Extraction; Pattern Recognition; Software Engineering; Theory and Methods
Abstract:
Cardiovascular diseases are the leading cause of death worldwide. Computer science, and machine learning in particular, can therefore help assist practitioners. The literature presents various machine learning models that provide recommendations and alerts when anomalies such as heart failure are detected. This work uses dimensionality reduction techniques to improve the prediction of whether a patient has heart failure, validated across several classifiers. The data were taken from the UCI Machine Learning Repository, in data sets containing 13 features and a binary categorical class label. Of the 13 features, the top six were ranked by a chi-square feature selector, and a PCA analysis was then performed. The selected features were passed to seven classification models for validation. The best performance was achieved by the ChiSqSelector and PCA models.
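The chi-square ranking step mentioned in the abstract can be illustrated with a minimal plain-Python sketch. It is not the authors' Spark pipeline: the function names (`chi_square_score`, `top_k_features`), the feature names, and the toy data are all hypothetical, and the sketch assumes categorical features and a binary label. It computes the Pearson chi-square statistic between each feature and the label, then keeps the k highest-scoring features. The subsequent PCA step is not shown.

```python
from collections import Counter


def chi_square_score(feature, labels):
    """Pearson chi-square statistic between one categorical feature and the label."""
    n = len(labels)
    observed = Counter(zip(feature, labels))   # joint counts per (value, label)
    feature_totals = Counter(feature)          # marginal counts per feature value
    label_totals = Counter(labels)             # marginal counts per label value
    score = 0.0
    for f_value, f_count in feature_totals.items():
        for l_value, l_count in label_totals.items():
            expected = f_count * l_count / n   # count expected under independence
            score += (observed[(f_value, l_value)] - expected) ** 2 / expected
    return score


def top_k_features(columns, labels, k):
    """Rank features by chi-square score (highest first) and keep the top k."""
    scores = {name: chi_square_score(values, labels)
              for name, values in columns.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k], scores


# Toy data (hypothetical, NOT the UCI heart disease records).
labels = [1, 1, 1, 1, 0, 0, 0, 0]
columns = {
    "feature_a": [1, 1, 1, 1, 0, 0, 0, 0],  # perfectly aligned with the label
    "feature_b": [1, 1, 1, 0, 0, 0, 0, 1],  # partly aligned with the label
    "feature_c": [1, 0, 1, 0, 1, 0, 1, 0],  # independent of the label
}

selected, scores = top_k_features(columns, labels, k=2)
print(selected)
print(scores)
```

In the study itself, the analogous step would keep six of the 13 features before PCA is applied; here k=2 is chosen only to keep the toy example small.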