Convolutional Neural Network's Stacking Classifier on
Cardiovascular Disease
Chuhong Zhou
Jinan University-University of Birmingham Joint Institute, Jinan University, Guangzhou, Guangdong, 511443, China
Keywords: Ensemble Learning, Classification, Prediction, Machine Learning
Abstract: Cardiovascular disease is the leading cause of death worldwide. To diagnose cardiovascular disease, multiple
risk indicators need to be combined, which is a challenge for limited medical resources. To reduce
misdiagnosis, machine learning is being used to predict cardiovascular disease. Due to the inherent defect of
algorithms, the results of a single model will produce certain errors. To improve prediction accuracy,
Ensemble Learning combines several machine learning algorithms. However, Convolutional Neural Network
(CNN), as an algorithm of machine learning, is not sufficiently applied in predicting problems of
cardiovascular disease. Part of the data collected by the Centers for Disease Control and Prevention (CDC) in
2022 was used in this experiment, and pre-processing operations such as feature selection, Undersampling,
and Synthetic Minority Over-sampling Technique (SMOTE) were performed. This experiment tested the
accuracy of using a CNN as the base learner and meta-learner for the stacking model and compared it with
traditional algorithms. The results show that the accuracy of the ensemble learning model that integrates CNN
is 91.13, which is higher than the traditional algorithm compared to it.
1 INTRODUCTION
Heart disease is a disease involving the heart and
blood vessels, including coronary heart disease,
cerebrovascular disease, rheumatic heart disease, and
other related diseases. The heart is second only to the
brain as an important organ of the human body, and
cardiovascular diseases have a huge impact on
patients. According to the World Health Organization
(WHO) estimating, in 2019, heart disease accounted
for 32% of global deaths, totaling approximately 17.9
million people (World Health Organization 2021).
Research has shown that using a wide range of
intervention measures to prevent cardiovascular
disease is cost-effective in both low - and middle-
income areas (Shroufi et al. 2013).
However, there are problems in the diagnosis of
cardiovascular disease. The risk indicators related to
cardiovascular disease include blood pressure,
myocardial enzymes, low-density lipoprotein
cholesterol, and other indicators. Personal lifestyle
also has an impact on the incidence rate, such as
smoking, diet, obesity, and lack of exercise (Tsao et
al. 2023). Doctors need to identify, quantify, and
explain the relationships between variables. To
accurately diagnose heart disease, skilled and
experienced doctors and excellent medical equipment
are required, which is a challenge for both society and
the economy.
Therefore, when predicting cardiovascular
diseases, it is necessary to introduce the excellent
information processing ability and computing speed
of the computer. Machine learning is a branch of
computers. With the increasing amount and
complexity of available data and the improvement of
computer computing power, machine learning can
learn from the ever-increasing data, and it is possible
to use artificial intelligence to accelerate and enhance
the research and clinical application of heart disease
(Jone et al. 2022).
In the past few years, scholars and researchers
have attempted to apply machine learning to disease
prediction and have tried various algorithms, such as
Decision Tree, k-nearest neighbor algorithm, and
Random Forests, and achieved good experimental
results (Sudheer et al. 2021).
However, a single machine learning model may
produce some errors in predicting results when facing
complex problems due to the differences in algorithm
logic and computational methods. Ensemble learning
is a method of combining multiple foundational
models to form a more powerful predictive model.
116
Zhou, C.
Convolutional Neural Network’s Stacking Classifier on Cardiovascular Disease.
DOI: 10.5220/0012827800004547
Paper published under CC license (CC BY-NC-ND 4.0)
In Proceedings of the 1st International Conference on Data Science and Engineering (ICDSE 2024), pages 116-121
ISBN: 978-989-758-690-3
Proceedings Copyright © 2024 by SCITEPRESS – Science and Technology Publications, Lda.
Research has shown that applying ensemble learning
to cardiovascular disease prediction can leverage the
complementarity between different models and
provide more reliable decision-making results
(Ahmed et al. 2022).
In addition, cardiovascular disease is not a single
disease, and there is a certain correlation between
various complications. The interconnection of
parameters may change the prediction results (Zhou
et al. 2023). In actual prediction, it is necessary to
make comprehensive judgments and extract features.
Convolutional Neural Network (CNN) is a deep
learning algorithm widely used in image processing
and pattern recognition. It has a strong feature
extraction ability and can capture local structures in
images. However, the application in data analysis is
insufficient.
Therefore, the objective of this project is to use
CNN as a base learner and a meta-learner to test the
accuracy of ensemble learning models containing
CNN in predicting cardiovascular diseases and
compare the prediction accuracy with ensemble
learning models composed of other classifiers.
2 METHODOLOGY
2.1 Dataset
The data for this experiment was initially obtained
from the Behavioral Risk Factors Monitoring System
(BRFSS), which is an annual telephone survey
conducted by its subsidiary, the Centers for Disease
Control and Prevention (CDC) (Kaggle 2023). Select
246022 samples and 40 health indicators from
approximately 400000 samples and 300 health
indicators. Among them, 232587 had no or no history
of heart disease, and 13435 had a history of heart
disease.
2.2 Pre-Processing
Features that have little or no relevance to
classification results in the data set will reduce the
model performance and increase the computational
cost. Therefore, feature selection is required before
training the model (Abdollahi & Nouri-Moghaddam
2022). In this experiment, Relief and FCBF
algorithms were used to screen valuable
characteristics of heart disease. In addition, as shown
in Figure 1, the model structure will be re-selected
after each training round based on the training results.
Figure 1. The structure of the experimental model (Photo
credit: Original)
Figure 2 shows the distribution of samples. The
number of samples suffering from heart disease is
higher than the number of samples without heart
disease. There is a class-imbalance in this dataset.
When training a model, the model may be more
inclined to predict classes with a larger sample size,
resulting in classification errors for classes with a
smaller sample size. To solve this problem, the
experiment adopts Undersampling to randomly delete
most of the samples in the dataset. Synthetic Minority
Over Sampling Technique (SMOTE) is also used to
balance the datasets. The general steps of SMOTE are
as follows:
1. Find the k nearest neighbor samples for
each minority class sample.
2. Randomly select a sample from the k
samples as the nearest neighbor sample.
3. Make a copy of the nearest neighbor
sample to generate a new composite sample.
4. Repeat Steps 2-3 until a preset number of
synthetic samples are generated.
Figure 2. The distribution of the data set (Photo credit:
Original).
2.3 Stacking Classifier
Figure 3 briefly illustrates the process of the stacking
algorithm. Stacking uses a meta learner to integrate
the prediction results of multiple base models, which
can compensate for the shortcomings of a single
model, further explore features, and better classify
data.
Convolutional Neural Network’s Stacking Classifier on Cardiovascular Disease
117
Figure 3. The structure of the stacking (Photo credit: Original).
1. The first layer is base learners, which use the
same or different training sets to train multiple base
learners. Each base learner can use the same or
different classifiers, parameters, or training sets.
2. Use all base learners of the first layer to predict
the new sample set and record the prediction results.
It is best to use the cross-validation method in
practical applications.
3. Combine the predictions of all the base learners
in the first layer and use the predictions as a new data
set.
4. Train the meta-learner using the dataset
obtained in the previous step.
2.4 Base Learners
2.4.1 Convolutional Neural Network (CNN)
The convolutional Layer, Pooling Layer, and Fully
Connected Layer constitute the basic unit of CNN
(Sudha & Kumar 2023). The core idea of CNN is to
mine the features of data by using convolution
operations. Then, reduce the dimensionality of the
feature space through pooling operations, and finally
combine advanced features through fully connected
layers to produce the final classification or regression
results.
Convolutional Layer: Figure 4 shows the process
of convolutional layer operation. The convolutional
kernel calculates the inner product of the window
corresponding to the convolutional kernel by sliding
on the input data and generating a feature map. This
feature mapping will be conveyed to the next layer.
The weights can be learned through training.
Figure 4. The working process of the convolution layer (Photo credit: Original).
Pooling Layer: Convolutional layers often mine
numerous features, which need to be appropriately
reduced through pooling operations. Common
pooling operations include Max Pooling and Average
ICDSE 2024 - International Conference on Data Science and Engineering
118
Pooling, which take the maximum or average value
from multiple features.
Fully Connected Layer: Merge all the features in
the previous layer together to convert into the final
classification or regression result.
2.4.2 Decision Tree (DT)
The samples to be trained start from the root node, and
each internal node contains a decision rule that can
gradually classify the samples. The final samples will
be divided into multiple subsets and form a tree like
structure.
2.4.3 K-nearest neighbour (KNN)
KNN is a simple and intuitive supervised learning
algorithm. The basic principle is to compare K
training samples with the closest similarity to the
predicted sample. The most commonly used decision-
making method is to select the category that appears
most frequently among the K nearest neighbor
samples.
2.4.4 Support Vector Machines (SVM)
The principle of SVM is to set a hyperplane in the
sample space, which maximizes the distance between
the hyperplane and the different class sample points
closest to the hyperplane so that the sample points can
be effectively segmented. The support vectors are the
points closest to the hyperplane, which determines the
decision boundary.
The accuracy of SVM may suddenly decrease as
the number of training samples increases (Deng
2023). Therefore, the training set size of the base
learners composed of SVM in this experiment is
smaller than that of other base learners.
2.4.5 Logistic Regression (LP)
Logistic regression is a classic algorithm that uses the
sigmoid function to establish a relationship between
input features and output features, and its results
directly reflect the probability of the sample
belonging to a certain category.
2.4.6 Naive Bayes (BY)
Naive Bayes is based on the Bayesian theorem, where
the prior probability of a certain feature is known, the
posterior probability is calculated, and then the
sample is classified with the maximum Posterior
Probability.
3 RESULTS AND ANALYSIS
In this experiment, a total of six base learners were
introduced, including DT, KNN, LR, NB, SVM, and
CNN, with a total of 30 models per model. Figure 5
shows the accuracy of all base learners in the
experiment. Among them, the average accuracy of the
CNN classifiers is 74.81%, and the highest accuracy
is 86.75%.
Figure 5. The accuracies of the base learners (Photo credit:
Original).
Figures 6 and 7 respectively show the changes in
accuracy and loss function with increasing iteration
times, and compare them with different learning rates.
When the learning rate is 0.001, the overall
performance of the model is the best, with the highest
accuracy of 91.13. And the loss is lower than other
learning rates.
Figure 6. The accuracies of different learning rates (Photo
credit: Original).
Convolutional Neural Network’s Stacking Classifier on Cardiovascular Disease
119
Figure 7. The loss of different learning rates (Original)
Accuracy represents the ratio between the
correctly classified number and the sample size. The
formula is
𝑎𝑐𝑐𝑢𝑟𝑎𝑐𝑦


(1)
Table 1 is the fusion matrix, which defines
parameters in formula (1), where True and False
represent whether the predicted and accurate results
of the model are the same. Positive and Negative
represent the categories of predicted results.
Table 1. The confusion matrix (Table credit: Original).
Actual class
Have cardiovascular disease No cardiovascular disease
Predicted class
Have cardiovascular disease True Positive (TP) False Positive (FP)
No cardiovascular disease False Negative (FN) True Negative (TN)
However, when there is a class ambiguity issue in
the dataset, i.e. the model may perform well in overall
accuracy, but its accuracy may decrease when
predicting minority categories (Amalia et al. 2019).
That is to say, there is bias in the model. Therefore,
when comparing the performance of the model,
Precision and recall were introduced as evaluation
criteria, and the formula is:
𝑝𝑟𝑒𝑐𝑖𝑠𝑖𝑜𝑛


(2)
𝑟𝑒𝑐𝑎𝑙𝑙


(3)
F1 score can better measure the overall
performance of the classifier on imbalanced datasets
by comprehensively considering accuracy and recall,
providing a more comprehensive and accurate
performance evaluation. The formula is:
𝐹
2


(4)
Figure 8 and Table 2 show the performance of the
CNN classifier as a meta-learner and compare it with
the other 5 classifiers. CNN achieved an accuracy of
90.9, with the F1 score being the highest among the
six classifiers. The higher the F1 score, the better the
balance between Precision and Recall achieved by the
classifier, and the more accurate its predictive ability
for positive and negative class samples. Therefore,
CNN models have good performance in predicting
cardiovascular disease problems.
Figure 8. The comparison of different meta-learners
(Photo credit: Original).
Table 2. The F1 score of different meta-learners (Table credit: Original).
Algorithm CNN KNN NB DT LR SVM
F1-score 0.7080 0.4608 0.4930 0.5600 0.5462 0.4528
ICDSE 2024 - International Conference on Data Science and Engineering
120
4 CONCLUSION
Cardiovascular disease is the disease that causes the
highest number of deaths and has a huge impact on
human health. To accurately predict cardiovascular
diseases, ensemble learning has been widely applied
in this field.
In this experiment, CNN was incorporated into
ensemble learning as a base learner and a meta-
learner. When performing feature selection, Relief,
and FCBF algorithms were used, combined with
model prediction accuracy. Undersampling and
SMOTE algorithms were used to solve the class
balance problem of data. The experiment also tested
the performance of DT, KNN, LR, NB, SVM and
CNN as base learners and meta-learners, and
compared them. The results show that CNN has
excellent performance in processing cardiovascular
disease data, with a prediction accuracy of 91.13. It
also outperforms other traditional algorithms in the F1
score.
About suggestions for further research, CNN can
be used in the diagnosis of more diseases in the future,
such as cancer, diabetes, respiratory diseases, etc.
REFERENCES
A. Shroufi, R. Chowdhury, R. Anchala, BMC Public
Health, 13(1), 1-1, (2013).
C. W. Tsao, A. W. Aday, Z. I. Almarzooq, Circulation,
147(8), e93-e62, (2023).
D. Sudheer, A. Potti, N. Anjali Devi, et al, International
Journal of Computer Sciences and Engineering, 9(8):
27-29, (2021).
D. Zhou, H. Qiu, M. Shen, et al, BMC Medical Informatics
and Decision Making, 23(1), 99. (2023).
J. Abdollahi, B. Nouri-Moghaddam, Iran Journal of
Computer Science, 5(3), 229-246, (2022).
J. Deng, Engineering and Technology, 38, 187-198, (2023).
Kaggle, Indicators of Heart Disease (2022 UPDATE),
2023, available at
https://www.kaggle.com/datasets/kamilpytlak/personal
-key-indicators-of-heart-disease
L. Amalia, C. Alejandro, M. Alejandr, et al, Pattern
Recognition, 91, 216-231, (2019).
P. N. Jone, A. Gearhart, H. Lei, et al, JACC: Advances,
1(5), 100153, (2022).
S. Ahmed, S. Shaikh, F. Ikran, et al, Journal of Sensors,
(2022).
V. K. Sudha, D. Kumar, Engineering and Technology, 38,
187-198, (2023).
World Health Organization, Cardiovascular diseases
(CVDs), 2021, available at
https://www.who.int/en/news-room/fact-
sheets/detail/cardiovascular-diseases-(cvds)
Convolutional Neural Network’s Stacking Classifier on Cardiovascular Disease
121