Convolutional Neural Network's Stacking Classifier on
Cardiovascular Disease
Chuhong Zhou
Jinan University-University of Birmingham Joint Institute, Jinan University, Guangzhou, Guangdong, 511443, China
Keywords: Ensemble Learning, Classification, Prediction, Machine Learning
Abstract: Cardiovascular disease is the leading cause of death worldwide. To diagnose cardiovascular disease, multiple
risk indicators need to be combined, which is a challenge for limited medical resources. To reduce
misdiagnosis, machine learning is being used to predict cardiovascular disease. Because every algorithm has
inherent limitations, the predictions of a single model contain some error. To improve prediction accuracy,
ensemble learning combines several machine learning algorithms. However, the Convolutional Neural Network
(CNN), a machine learning algorithm, has not been sufficiently applied to cardiovascular disease
prediction. Part of the data collected by the Centers for Disease Control and Prevention (CDC) in
2022 was used in this experiment, and pre-processing operations such as feature selection, undersampling,
and the Synthetic Minority Over-sampling Technique (SMOTE) were performed. This experiment tested the
accuracy of a stacking model using a CNN as the base learner and meta-learner and compared it with
traditional algorithms. The results show that the accuracy of the ensemble learning model that integrates CNN
is 91.13%, which is higher than that of the traditional algorithms it was compared with.
1 INTRODUCTION
Cardiovascular disease is a group of disorders of
the heart and blood vessels, including coronary heart
disease, cerebrovascular disease, rheumatic heart
disease, and other related conditions. The heart is
second only to the brain in importance among the
organs of the human body, and cardiovascular
diseases have a huge impact on patients. According
to World Health Organization (WHO) estimates,
cardiovascular disease accounted for 32% of global
deaths in 2019, approximately 17.9 million people
(World Health Organization 2021).
Research has shown that using a wide range of
intervention measures to prevent cardiovascular
disease is cost-effective in both low- and
middle-income areas (Shroufi et al. 2013).
However, diagnosing cardiovascular disease is
difficult. The related risk indicators include blood
pressure, myocardial enzymes, and low-density
lipoprotein cholesterol, among others. Personal
lifestyle factors such as smoking, diet, obesity, and
lack of exercise also affect the incidence rate (Tsao et
al. 2023). Doctors need to identify, quantify, and
interpret the relationships between these variables.
Accurate diagnosis of heart disease therefore requires
skilled and experienced doctors and high-quality
medical equipment, which is a challenge both
socially and economically.
Predicting cardiovascular disease can therefore
benefit from the information-processing ability and
computing speed of computers. Machine learning is a
branch of computer science. As the amount and
complexity of available data increase and computing
power improves, machine learning can learn from
ever-growing data, making it possible to use artificial
intelligence to accelerate and enhance heart disease
research and its clinical application
(Jone et al. 2022).
In the past few years, researchers have applied
machine learning to disease prediction, trying
algorithms such as Decision Tree, k-nearest
neighbors, and Random Forest, and have achieved
good experimental results (Sudheer et al. 2021).
However, because algorithms differ in their logic
and computational methods, a single machine
learning model may make prediction errors on
complex problems. Ensemble learning is a method of
combining multiple base models to form a more
powerful predictive model.
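One common way of combining models is stacking, in
which a meta-learner is trained on the predictions of
the base learners. The following is a minimal sketch
of that idea and not the model evaluated in this
paper: it assumes synthetic data in place of the CDC
data, and it uses small logistic regressions as
stand-ins for both the base learners and the
meta-learner, where the paper uses a CNN. All
function names are hypothetical, and only the Python
standard library is used.

```python
import math
import random

def make_data(n, seed):
    """Synthetic stand-in data: the label depends on a noisy sum of two features."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        a, b = rng.gauss(0, 1), rng.gauss(0, 1)
        X.append([a, b])
        y.append(1 if a + b + rng.gauss(0, 0.5) > 0 else 0)
    return X, y

def fit_logistic(X, y, epochs=200, lr=0.5):
    """Fit logistic regression by batch gradient descent; return a probability function."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for row, label in zip(X, y):
            p = 1.0 / (1.0 + math.exp(-(sum(wi * xi for wi, xi in zip(w, row)) + b)))
            for j, xj in enumerate(row):
                grad_w[j] += (p - label) * xj
            grad_b += p - label
        w = [wi - lr * g / len(y) for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / len(y)
    return lambda row: 1.0 / (1.0 + math.exp(-(sum(wi * xi for wi, xi in zip(w, row)) + b)))

def single_feature_fitter(j):
    """Base-learner factory (assumed for illustration): a model that only sees feature j."""
    def fit(X, y):
        model = fit_logistic([[row[j]] for row in X], y)
        return lambda row: model([row[j]])
    return fit

def fit_stacking(X, y, base_fitters, k=5):
    """Train a meta-learner on out-of-fold predictions of the base learners."""
    n = len(X)
    Z = [[0.0] * len(base_fitters) for _ in range(n)]
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        held_out = [i for i in range(n) if i % k == fold]
        X_tr, y_tr = [X[i] for i in train], [y[i] for i in train]
        for j, fit in enumerate(base_fitters):
            model = fit(X_tr, y_tr)
            for i in held_out:
                # Each meta-feature comes from a model that never saw sample i.
                Z[i][j] = model(X[i])
    meta = fit_logistic(Z, y)
    # Refit the base learners on all training data for use at prediction time.
    bases = [fit(X, y) for fit in base_fitters]
    return lambda row: meta([m(row) for m in bases])

def accuracy(model, X, y):
    """Fraction of samples whose predicted probability falls on the correct side of 0.5."""
    return sum((model(row) > 0.5) == (label == 1) for row, label in zip(X, y)) / len(y)

X_train, y_train = make_data(300, seed=0)
X_test, y_test = make_data(500, seed=1)
base_fitters = [single_feature_fitter(0), single_feature_fitter(1)]

stacked = fit_stacking(X_train, y_train, base_fitters)
base_accs = [accuracy(fit(X_train, y_train), X_test, y_test) for fit in base_fitters]
stack_acc = accuracy(stacked, X_test, y_test)
print("base learners:", base_accs, "stacked:", stack_acc)
```

Each base learner in this sketch sees only one
feature, so neither can recover the label alone. The
meta-learner sees both sets of predictions and can
combine them. Training it on out-of-fold predictions
keeps it from learning from base-learner outputs on
samples those learners were fitted to.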