Authors:
V. G. Almeida (1); J. Borba (1); T. Pereira (1); H. C. Pereira (2); J. Cardoso (1) and C. Correia (1)
Affiliations:
(1) University of Coimbra, Portugal; (2) University of Coimbra and ISA-Intelligent Sensing Anywhere, Portugal
Keyword(s):
Data Mining, Artificial Neural Network, Clustering, Arterial Distension Waveform, Cardiovascular Diseases.
Related Ontology Subjects/Areas/Topics:
Bioinformatics; Biomedical Engineering; Data Mining and Machine Learning; Pattern Recognition, Clustering and Classification
Abstract:
Cardiovascular diseases (CVDs) are the leading cause of death in the world. Pulse wave analysis provides new insight into these pathologies, while data mining techniques can contribute to an efficient diagnostic method. Among the available techniques, artificial neural networks (ANNs) are well established in biomedical applications and have been applied successfully to numerous classification problems. Clustering procedures have also proven very useful in assessing different risk groups in terms of cardiovascular function in healthy populations. In this paper, a robust data mining approach was applied to the identification of cardiac risk patterns. Eight classifiers were tested: C4.5, Random Forest, RIPPER, Naïve Bayes, Bayesian Network, multilayer perceptron (MLP) with one and with two hidden layers, and a radial basis function (RBF) network. For clustering, k-means (using Euclidean distance) and expectation-maximization (EM) were the chosen algorithms. Two datasets were used as case studies for the classification and clustering analyses. Classification accuracy ranged from 88.05% to 97.15%. The clustering techniques were essential in the analysis of a dataset for which little information was available, allowing the identification of clusters that represent different risk groups in terms of cardiovascular function. The three-cluster analysis allowed distinctive features to be characterized for each cluster. Reflected wave time (T_RP) and systolic wave time (T_SP) were the features selected for cluster visualization. Data mining methodologies have proven their usefulness in screening studies owing to their descriptive and predictive power.
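As an illustration of the k-means step named in the abstract, the following is a minimal Python sketch of Lloyd's algorithm with Euclidean distance on two features. It is not the authors' implementation. The function name `kmeans`, its parameters, the variable `hypothetical_centres`, and the synthetic (T_RP, T_SP) values are all assumptions made for the example, and none of the numbers come from the paper's datasets.

```python
import math
import random


def kmeans(points, k, max_iter=100, seed=0):
    """Plain Lloyd's k-means using Euclidean distance.

    points: list of equal-length numeric tuples.
    Returns (centroids, labels).
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise from k sampled data points
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged: centroids no longer move
        centroids = new_centroids
    return centroids, labels


# Synthetic (T_RP, T_SP) pairs in arbitrary units; NOT data from the paper.
rng = random.Random(42)
hypothetical_centres = [(0.10, 0.20), (0.20, 0.10), (0.30, 0.30)]
points = [
    (rng.gauss(cx, 0.01), rng.gauss(cy, 0.01))
    for cx, cy in hypothetical_centres
    for _ in range(20)
]

centroids, labels = kmeans(points, k=3)
for c, centre in enumerate(centroids):
    print(c, centre, labels.count(c))  # cluster index, centroid, cluster size
```

With k = 3 the sketch mirrors the three-cluster setting described above. Random initialisation means the cluster indices carry no meaning of their own, so any risk-group interpretation would have to come from inspecting the centroids.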