2 RELATED RESEARCH
Traditional methods for predicting heart disease often
rely on the experience of doctors and manual analysis.
They are limited by human cognitive ability and
information processing speed, which can easily lead to
subjective biases and misjudgments. With the rapid
advancement of technology, researchers are
increasingly recognizing the advantages of integrating
artificial intelligence and machine learning models
into heart disease prediction. In a prospective cohort
study, Weng et al. found that machine learning
algorithms significantly improved the prediction of
cardiovascular disease, supporting the effectiveness
and feasibility of machine learning techniques for
this task (Weng, 2017).
For example, Li Linghai compared the
effectiveness of traditional feature extraction
algorithms and deep learning methods in the
classification of echocardiograms, finding that deep
learning significantly improved classification
accuracy, especially in the classification of heart
disease (Li, 2017). Cheng et al. integrated Random
Forest with SHAP for heart disease prediction and
found that the combined approach predicted heart
disease more accurately (Cheng, 2023). Liu et al.
studied feature selection for heart disease
prediction based on GBM and found that this method
can effectively improve the efficiency of medical
diagnosis (Liu, 2023). Shafiey et al. introduced an
efficient hybrid of a genetic algorithm and particle
swarm optimization, based on Random Forest, to
optimize the feature extraction process and select
key features that enhance the accuracy of heart
disease diagnosis (Shafiey, 2022).
Mohan et al. proposed a heart disease prediction
model called HRFLM, which combines a random forest
(RF) with a linear model. The model improved
disease prediction performance, achieving an
accuracy of 88.7% (Mohan, 2019). In 2002, Ali et al.
used an adapted form of the Health Belief Model on
a dataset of 178 female patients with coronary heart
disease to analyze the prediction of coronary heart
disease risk factors (Ali, 2002).
In 2009, Avci utilized genetic algorithms
to optimize the parameters of a Support Vector
Machine model and experimentally validated it on
heart disease data. The results indicated that
this method could achieve better predictive
performance (Avci, 2009). In 2013, Amin et al.
proposed a hybrid model combining artificial neural
networks with genetic algorithms, using the genetic
algorithm to optimize the connection weights of the
neural network and thereby improve its predictive
performance. The model used 50 identified
important risk factors to predict heart disease and
achieved an accuracy of 89% (Amin, 2013).
Previous studies have used models such as
Logistic Regression, Decision Trees, and Deep
Neural Networks to analyze the related data, and
these models achieve a certain level of accuracy
and play an important role. When faced with large
and dense datasets, however, these models still
have limitations, particularly in discovering
higher-order relationships and achieving further
gains in precision. GBDT, on which this experiment
is based, can effectively address these
shortcomings.
3 METHOD
We compare several machine learning models,
including but not limited to GBDT and LR, on the
task of predicting heart disease. Using performance
metrics, we determine the best-performing model and
select it as the final model (Fig. 1).
Figure 1: Research Workflow Diagram (Photo/Picture credit: Original).
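The comparison step can be illustrated with a minimal sketch. It assumes a binary label (1 = heart disease, 0 = no heart disease) and assumes accuracy, precision, recall, and F1 as the performance metrics, since the specific metrics are not named here. The helper names (evaluate, select_best) and the toy labels and predictions are hypothetical and are not results from this study.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def evaluate(y_true, y_pred):
    """Compute binary classification metrics (1 = disease, 0 = no disease)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return Metrics(accuracy, precision, recall, f1)


def select_best(y_true, predictions, key="f1"):
    """Score every model and return (name of best model, all scores)."""
    scores = {name: evaluate(y_true, y_pred)
              for name, y_pred in predictions.items()}
    best = max(scores, key=lambda name: getattr(scores[name], key))
    return best, scores


# Made-up test labels and model outputs, for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
predictions = {
    "GBDT": [1, 0, 1, 1, 0, 1, 1, 0],
    "LR":   [1, 0, 0, 1, 0, 1, 0, 0],
}

best_name, scores = select_best(y_true, predictions)
for name, m in scores.items():
    print(f"{name}: acc={m.accuracy:.3f} prec={m.precision:.3f} "
          f"rec={m.recall:.3f} f1={m.f1:.3f}")
print("selected:", best_name)
```

Ranking by F1 rather than accuracy is a design choice of this sketch. In a screening setting, recall may be the more relevant criterion, which can be selected by passing key="recall".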