Machine Learning Methods for Heart Disease Prediction
Hongyu Zhou
Department of Mathematics, Southern University of Science and Technology, Shenzhen, 518000, China
Keywords: Machine Learning, Heart Disease Prediction, Neural Network, Evaluation.
Abstract: Heart disease prediction and treatment play a crucial role in enhancing human health. Numerous studies have
highlighted the effectiveness of machine learning models in predicting heart diseases. However, problems
remain with their widespread use and with the accuracy of their predictions. This paper aims to apply different machine
learning models, including Naïve Bayes, Decision Tree, Random Forest, XGBoost, and Neural Network
System, to a specific dataset and provide a comprehensive evaluation. After thorough analysis using various
metrics, the Random Forest model demonstrated the highest recall and F1-score among all models.
Additionally, the shallow neural network model outperformed traditional neural network structures with
fewer parameters in this task. In conclusion, this study emphasizes the significance of machine learning
models in improving heart disease prediction and treatment. Further research and development in this area are
essential to enhance healthcare outcomes and promote overall well-being.
1 INTRODUCTION
Ischemic heart disease is the world's largest cause of
death, responsible for 16% of all deaths, according to
the World Health Organization. Moreover, since
2000, the situation has become increasingly severe
(Soni et al., 2011; Chitra et al., 2022; Chen and
Guestrin, 2016). In 2019, approximately 8.9 million
people were killed by this disease.
Over the past decade, the understanding and
treatment of ischemic heart disease have
improved significantly (Kaggle Heart
disease dataset). Given this progress, it is important
for patients to receive professional medical assistance before the
disease becomes aggravated. However, another
problem remains. For patients, some kinds of
heart disease are difficult to discover without a
specialized hospital diagnostic examination. Machine
learning techniques are therefore expected to predict
the disease from just some basic personal information
(Tsao et al., 2022). Fortunately, several studies have
listed features that are easy to measure and
are related to heart disease. People with no
significant cardiovascular disease symptoms can thus be
detected early and can receive early medical
treatment, which may reduce their death rate
significantly (Chintan et al., 2023). Thus, it is
meaningful to apply certain machine learning
methods to predict whether a person will have heart
disease from some physical indicators.
With the development of machine learning
techniques, some applications have achieved
success in this field. However, problems remain
with their widespread use and with the accuracy of their
predictions. This paper focuses on analyzing five different
machine learning methods for heart disease prediction,
aiming to find the strengths and
weaknesses of each method.
2 METHOD
2.1 Data Preprocessing
Before establishing and training models, data
preprocessing and analysis are crucial. The project
uses a dataset from Kaggle that
is neat and well-documented.
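The paragraph that follows describes a random 80%/20% partition of the records into training and test parts, with no validation set. A minimal sketch of such a partition is given here for illustration only. The function name split_indices, the seed value, and the rounding of the test-set size are assumptions made for this sketch; the paper does not state which tool or random seed was actually used.

```python
import random


def split_indices(n_records, test_fraction=0.2, seed=0):
    """Randomly partition record indices into (train, test) lists.

    Hypothetical helper for illustration; not the author's code.
    """
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    indices = list(range(n_records))
    # A seeded generator keeps the shuffle reproducible across runs.
    random.Random(seed).shuffle(indices)
    # Assumption: the test-set size is rounded to the nearest integer.
    n_test = round(n_records * test_fraction)
    # The first n_test shuffled indices form the test part; the rest train.
    return indices[n_test:], indices[:n_test]


# 4238 is the record count reported for the dataset.
train_idx, test_idx = split_indices(4238, test_fraction=0.2, seed=0)
print(len(train_idx), len(test_idx))
```

Because no validation set is held out, every record falls into exactly one of the two parts. The exact part sizes depend on the rounding convention, which the paper does not specify.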
The dataset comprises 4238 records sourced from
the Centers for Disease Control and Prevention,
National Center for Health Statistics. For all
subsequent model training, this dataset was randomly
split into two parts: one part for training and
another for testing (no validation set was used in this
experiment). The training and test parts
represent 80% and 20% of the total dataset,
respectively. Within this dataset, there are a total of
15 features related to predicting heart disease,