The Comprehensive Investigation of Machine Learning-Based Patient
Brain Stroke Prediction
Xiuyuan Wei (ORCID: https://orcid.org/0009-0003-2068-733X)
Internet of Things, Concord University College Fujian Normal University, Fuzhou, Fujian, China
Keywords: Stroke Prediction, Machine Learning, Traditional Methods, Deep Learning.
Abstract: This paper reviews machine learning methodologies for stroke prediction, evaluating both traditional
and deep learning approaches and discussing the challenges and potential solutions in this domain.
Traditional techniques are examined for their efficacy on stroke prediction tasks across various
datasets. Deep learning approaches such as U-Net and Generative Adversarial Networks are also investigated
to assess their suitability and performance. The review further considers interpretability, privacy
concerns, and data quality issues, and explores SHapley Additive exPlanations (SHAP) and Federated
Learning (FL) as potential means of improving interpretability and protecting patient privacy. It also
examines the potential of transfer learning to improve model generalization across different domains.
The findings suggest that machine learning is promising for stroke prediction. Future research directions
include integrating emerging techniques such as large language models and multimodal data fusion to
improve accuracy. The review is intended to guide researchers and practitioners in selecting appropriate
machine learning methods and addressing challenges in stroke prediction for enhanced patient care.
1 INTRODUCTION
A stroke is a neurological condition that arises from
sudden, localized damage to the central nervous
system due to vascular problems. It encompasses
cerebral infarction, intracerebral hemorrhage (ICH)
(Qiu, 2019), and subarachnoid hemorrhage (SAH), and
it contributes significantly to global disability and
mortality. Each year, stroke claims the lives of roughly
4.6 million individuals, about 9 percent of
global deaths. Beyond its lethal impact, stroke also
leads to substantial nonfatal health issues and
disabilities. While research from developed nations
shows that targeted interventions at individual,
community, and national levels can significantly
reduce the occurrence of stroke and related vascular
conditions, this knowledge has not been uniformly
implemented in developing regions. The frequency of
stroke differs markedly between populations, with
risk increasing with age and with unhealthy lifestyle
choices (Azam, 2020). Within the contemporary clinical
realm, the diagnostic procedure for identifying stroke
is often laborious and inefficient: it consumes time
and human resources and occasionally leads to
misdiagnosis. Consequently, there
is an urgent need to explore alternative
methodologies for stroke prediction, with a specific
emphasis on Artificial Intelligence (AI) models,
which have performed well
in many tasks (Sun, 2020; Wu, 2024).
This transition towards AI-based stroke prediction
represents a significant departure from traditional
diagnostic approaches and holds considerable
promise for improving the efficiency and precision of
diagnostic protocols, and ultimately for
better patient outcomes and allocation
of healthcare resources. In a recent study, researchers
used retrospective data to evaluate two distinct
algorithmic approaches: an algorithm validated
through statistical methods and another trained by
clinicians. Their findings revealed that the
Wei, X.
The Comprehensive Investigation of Machine Learning-Based Patient Brain Stroke Prediction.
DOI: 10.5220/0012938200004508
Paper published under CC license (CC BY-NC-ND 4.0)
In Proceedings of the 1st International Conference on Engineering Management, Information Technology and Intelligence (EMITI 2024), pages 362-367
ISBN: 978-989-758-713-9
Proceedings Copyright © 2024 by SCITEPRESS Science and Technology Publications, Lda.
implementation of Random Forest demonstrated
superior prediction accuracy compared to the
clinician-trained algorithm. Specifically, the
Random Forest model exhibited higher
sensitivity and only marginally lower specificity
(Cox, 2016). In a later study, researchers evaluated
three supervised Machine Learning (ML)
models: Random Forest (RF), Gradient Boosting,
and U-Net. Of these, Gradient Boosting
performed best at forecasting tissue
outcomes following acute ischemic stroke (AIS),
closely followed by Random Forest and U-Net
(Benzakoun, 2021). In another study,
three prominent classification methods, Neural
Network (NN), Decision Tree (DT), and Random
Forest (RF), were compared for predicting stroke
occurrence from patient attributes. All three
models were trained on a balanced
dataset comprising 28,524 patient records. The
analysis revealed that the features were not strongly
correlated and that a combination of
only four features could contribute significantly to
accurate stroke prediction (Dev, 2022). CT scans
are a commonly used data source in
stroke research and diagnosis (Sirsat, 2020).
Other work has adopted a hybrid
machine learning strategy to forecast stroke
occurrence from incomplete and imbalanced medical
datasets, recognizing that the choice of dataset
affects the outcome of ML analysis (Liu,
2019).
This paper aims to identify the datasets and machine
learning methods currently best suited to stroke
prediction by examining and discussing machine
learning studies conducted on different datasets.
The paper proceeds as follows. Section 2 examines the
machine learning methods employed in the
literature to analyze diverse datasets.
Section 3 discusses the insights
derived from these comparative analyses,
focusing on identifying and addressing dataset
challenges to enhance the efficacy of machine
learning analyses. Finally, Section 4
summarizes the paper and draws conclusions based on
the preceding discussion.
2 METHOD
The framework of AI-based algorithms in stroke
prediction typically integrates both traditional
machine learning and deep learning approaches. It
begins with data collection,
gathering pertinent medical information including
demographic details, medical histories, and
diagnostic test results. The collected
data are then preprocessed to ensure
quality and consistency. In the model-building
phase, relevant features are selected and extracted
from the preprocessed data, a crucial step in
enhancing model performance, and traditional
machine learning algorithms, deep learning
architectures, or both are used to construct predictive
models. The models are then trained to capture
complex patterns within the data and tested
using established evaluation metrics to assess their
effectiveness in stroke prediction tasks. Upon
successful validation, the models are ready for
deployment, after which continuous
refinement and improvement remain the most
important task.
2.1 Traditional Machine Learning
2.1.1 Random Forest
Fernandez-Lozano et al. studied Random Forest-based
stroke outcome prediction (Fernandez-Lozano,
2021), motivated by a literature review
highlighting the strong performance of Random
Forests in biomedical applications. Data were
collected from 6022 patients, categorized into
Ischemic Stroke (IS), Intracerebral Hemorrhage
(ICH), and IS+ICH groups. After excluding certain
patients, the final dataset was prepared. The model
was trained using ten-fold cross-validation
with 100 repeated randomizations to ensure
robustness and reliability. Predictions were
generated for both mortality and morbidity
in each patient group (IS, ICH, and IS+ICH).
The aim was to determine the primary
predictors used by machine learning models, particularly
Random Forest, when generating predictive models.
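The resampling scheme described above (ten folds, repeated over many random partitions) can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not the authors' code; the function name `repeated_kfold_indices` and its parameters are assumptions made for this example, and the model fitting step itself is left out.

```python
import random


def repeated_kfold_indices(n_samples, n_folds=10, n_repeats=100, seed=0):
    """Yield (repeat, fold, train_idx, test_idx) for repeated k-fold CV.

    Hypothetical helper for illustration: every repeat reshuffles the
    samples, so each repeat uses a different random partition into folds.
    """
    rng = random.Random(seed)
    for repeat in range(n_repeats):
        order = list(range(n_samples))
        rng.shuffle(order)
        for fold in range(n_folds):
            # Every n_folds-th shuffled sample is held out in this fold.
            test_idx = order[fold::n_folds]
            held_out = set(test_idx)
            train_idx = [i for i in order if i not in held_out]
            yield repeat, fold, train_idx, test_idx


# Small demonstration: 50 samples, 10 folds, 3 repeats.
splits = list(repeated_kfold_indices(n_samples=50, n_folds=10, n_repeats=3))
print(len(splits))
```

In practice a model would be fitted on `train_idx` and scored on `test_idx` for every split, and the scores averaged across folds and repeats to reduce the variance introduced by any single partition.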
2.1.2 Gradient Boosting
Xie et al. used an extreme gradient boosting
model to examine 512 patients, aiming to forecast the
modified Rankin Scale (mRS) score at 90 days based on
biomarkers available upon admission and within 24
hours. The method employed a greedy algorithm for
feature selection and assessed model performance
through five-fold cross-validation. The results
indicate that decision tree-based gradient boosting
models achieve a high Area Under the Curve (AUC) in
predicting stroke patient recovery outcomes upon
admission. Additionally, stratifying patient groups
by recanalization status may offer insights
beneficial for treatment decision-making
(Xie, 2019).
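Greedy feature selection of the kind mentioned above can be illustrated with a forward-selection loop. The sketch below is an assumption about the general technique, not a reproduction of the cited study: `score_fn` stands in for a cross-validated performance estimate, and the feature names and gains in `toy_score` are invented purely for the demonstration.

```python
def greedy_forward_selection(features, score_fn, max_features=None):
    """Add one feature at a time, keeping the one that improves the score most.

    score_fn(list_of_features) -> float is a stand-in for a cross-validated
    metric such as AUC. Selection stops when no candidate improves the score.
    """
    selected = []
    best_score = float("-inf")
    remaining = list(features)
    while remaining and (max_features is None or len(selected) < max_features):
        # Score every remaining feature when added to the current subset.
        scored = [(score_fn(selected + [f]), f) for f in remaining]
        candidate_score, candidate = max(scored, key=lambda pair: pair[0])
        if candidate_score <= best_score:
            break  # no remaining feature helps
        selected.append(candidate)
        remaining.remove(candidate)
        best_score = candidate_score
    return selected, best_score


# Hypothetical gains: each feature adds some signal, each costs a small penalty.
TOY_GAIN = {"feature_a": 0.10, "feature_b": 0.15, "feature_c": 0.02, "feature_d": 0.0}


def toy_score(subset):
    return 0.5 + sum(TOY_GAIN[f] for f in subset) - 0.03 * len(subset)


chosen, score = greedy_forward_selection(list(TOY_GAIN), toy_score)
print(chosen, round(score, 2))
```

With a real model, `score_fn` would train and evaluate the classifier on each candidate subset, which is why greedy selection is usually preferred over an exhaustive search of all subsets.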
2.1.3 Decision Trees
Kappelhof et al. introduce a novel algorithm that
employs an evolutionary approach to develop
decision trees that are both interpretable and powerful
for predicting adverse outcomes following
endovascular therapy for acute ischemic stroke
(Kappelhof, 2021). Using 5-fold cross-validation,
the training cohort comprised an average of 1090
patients and the validation cohort
273 patients, with an average accuracy of
72%. In this decision tree, decision nodes contain
split-ranges rather than split-values; these are
defined mathematically using the notion of
belongingness and piecewise linear membership
functions. The algorithm's primary aim is to balance
constraining the size of the tree, to maintain
interpretability, against optimizing prediction accuracy
on unseen data. The algorithm was improved by
integrating this function into the
operation of the evolutionary algorithm. Initially,
decision trees were generated using the grow method
as part of the initialization process. Individuals were
then selected to advance to the
crossover phase and create the subsequent generation,
using the common one-point crossover method.
A mutation phase introduced
variability, and incorrect trees were pruned.
Imputation and the experimental setup
were also integrated into the process. On average,
the fuzzy algorithm converged to its final solution
within the first hour of execution.
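To make the idea of a split-range concrete, the following sketch shows a piecewise linear membership function and one plausible way of using it in a tree. It is a minimal illustration under stated assumptions: the names `left_membership` and `fuzzy_predict`, the dictionary layout of a node, and the rule of blending both branches by membership degree are all choices made for this example and are not taken from the cited paper.

```python
def left_membership(x, low, high):
    """Degree (0 to 1) to which x belongs to the left branch of a split-range.

    Below `low` the sample belongs fully to the left branch, above `high`
    fully to the right branch, and in between the degree falls linearly.
    """
    if x <= low:
        return 1.0
    if x >= high:
        return 0.0
    return (high - x) / (high - low)


def fuzzy_predict(node, sample):
    """Blend the predictions of both branches by membership degree.

    A node is either a leaf value (a number) or a dict with the keys
    'feature', 'low', 'high', 'left' and 'right' (assumed layout).
    """
    if not isinstance(node, dict):
        return float(node)
    mu_left = left_membership(sample[node["feature"]], node["low"], node["high"])
    return (mu_left * fuzzy_predict(node["left"], sample)
            + (1.0 - mu_left) * fuzzy_predict(node["right"], sample))


# Hypothetical one-node tree: age between 60 and 80 is the fuzzy region.
tree = {"feature": "age", "low": 60.0, "high": 80.0, "left": 0.2, "right": 0.7}
risk = fuzzy_predict(tree, {"age": 70.0})
print(risk)
```

A sample in the middle of the range contributes equally to both branches, so small changes in the input near a threshold no longer flip the prediction abruptly as they would with a crisp split-value.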
2.1.4 Neural Networks
Süt et al. conducted a comprehensive study utilizing
Multilayer Perceptron (MLP) neural networks to
predict mortality in stroke patients, incorporating a
dataset of 584 individuals and examining various
prognostic factors. Six distinct MLP algorithms were
employed: Quick Propagation (QP), Levenberg-
Marquardt (LM), Backpropagation (BP), Quasi-
Newton (QN), Delta Bar Delta (DBD), and Conjugate
Gradient Descent (CGD) (Süt, 2012). The QP
algorithm, despite its potential instability, showcased
remarkable efficiency in weight adjustment
computation, yielding the highest performance
metrics including specificity, sensitivity, accuracy,
and area under the curve (AUC). LM, utilizing a least
squares estimation method, showed reasonable
performance but fell short of QP in predictive
accuracy. BP, a widely used technique, demonstrated
inferior performance compared to QP despite its
simplicity. QN, an advanced training method, did not
surpass QP in predictive accuracy despite using an
approximation of the inverse Hessian matrix to choose
its search direction. DBD, an alternative to BP,
exhibited promising results but did not outperform
QP. CGD, employing iterative error gradient and
search direction calculations, displayed the lowest
predictive accuracy. Overall, the study underscores
the pivotal role of algorithm selection in MLP
modelling and highlights the potential efficacy of QP-
trained models in clinical mortality prediction.
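For readers unfamiliar with Quick Propagation, the core of the update can be shown on a single weight. The sketch below is a simplified, assumed form of the rule (a secant step on the gradient with a cap on step growth); the cited study does not describe its implementation, and the function name, defaults, and the quadratic test problem are illustrative choices only.

```python
def quickprop_minimize(grad, w0, lr=0.1, steps=20, max_growth=1.75, tol=1e-12):
    """Minimize a one-dimensional function with a simplified Quickprop rule.

    grad(w) returns the gradient at w. The first move is an ordinary
    gradient-descent step; later moves reuse the previous step and the
    change in gradient to jump towards the minimum of a fitted parabola.
    """
    w = w0
    prev_grad = grad(w)
    step = -lr * prev_grad
    w += step
    for _ in range(steps):
        g = grad(w)
        if abs(g) < tol:
            break  # gradient vanished, so w is (numerically) a minimum
        denom = prev_grad - g
        if abs(denom) < tol:
            new_step = -lr * g  # fall back to plain gradient descent
        else:
            new_step = step * g / denom  # secant step on the gradient
            # Cap the growth of the step, which limits instability.
            limit = max_growth * abs(step)
            new_step = max(-limit, min(limit, new_step))
        w += new_step
        prev_grad, step = g, new_step
    return w


# Toy problem: f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
w_star = quickprop_minimize(lambda w: 2.0 * (w - 3.0), w0=0.0)
print(round(w_star, 6))
```

The parabola fit is what makes the method fast on well-behaved error surfaces, and the growth cap is one common safeguard against the instability noted above, since an uncapped secant step can become very large when successive gradients are nearly equal.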
2.2 Deep Learning
2.2.1 U-Net
Li et al. conducted a study aimed at improving care
for patients with ischemic stroke by utilizing a
sophisticated multi-scale U-Net deep network model
(Li, 2021). This model was employed to segment
image features extracted from non-enhanced
computed tomography (CT) scans of 30 stroke
patients. To address the challenge of data imbalance
during model training, the authors incorporated the
Dice loss function, a loss derived from the Dice
similarity coefficient and commonly used in
medical image segmentation tasks. This function
helps optimize the model's performance by
penalizing false positives and false negatives, thereby
encouraging more accurate segmentation results.
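The reason a Dice-based loss helps with imbalance can be seen in a small example. The sketch below uses the common soft Dice formulation on flat lists of per-pixel values; the smoothing constant `eps` and the toy masks are assumptions for illustration and are not taken from the cited study.

```python
def soft_dice_loss(pred, target, eps=1e-6):
    """Return 1 - Dice coefficient for two equally long lists of values in [0, 1].

    pred holds predicted lesion probabilities per pixel, target the
    ground-truth mask. eps avoids division by zero on empty masks.
    """
    intersection = sum(p * t for p, t in zip(pred, target))
    denominator = sum(pred) + sum(target)
    dice = (2.0 * intersection + eps) / (denominator + eps)
    return 1.0 - dice


# Toy image with 10 pixels, only 2 of which are lesion (imbalanced classes).
target = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
all_background = [0.0] * 10           # 80% pixel accuracy, but finds no lesion
perfect = [float(t) for t in target]  # exact match with the ground truth

loss_background = soft_dice_loss(all_background, target)
loss_perfect = soft_dice_loss(perfect, target)
print(loss_background, loss_perfect)
```

Predicting background everywhere is right for most pixels, yet the Dice loss stays close to 1 because the overlap with the lesion is zero, which is why this loss does not reward a model for ignoring the minority class.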
The study involved two primary methods: manual
segmentation and automatic segmentation. In manual
segmentation, trained radiologists manually
delineated the ischemic stroke lesions on the CT
scans. On the other hand, automatic segmentation
utilized the multi-scale U-Net deep network model to
segment the lesions automatically. The comparison
between these two methods revealed that the
automatic segmentation closely approximated the
manual segmentation, indicating the effectiveness of
the proposed model in accurately identifying
ischemic stroke lesions.
The "lesion area error" refers to the difference
between the segmented lesion areas obtained from
automatic segmentation and manual segmentation.
This metric provides insight into the accuracy of the
automatic segmentation method compared to the gold
standard manual segmentation. A lower lesion area
error indicates higher accuracy in lesion delineation
by the automatic segmentation method.
The Pearson correlation coefficient is a statistical
measure used to assess the linear relationship between
two variables. In this context, it quantifies the degree
of correlation between the lesion areas obtained from
automatic segmentation and manual segmentation. A
Pearson correlation coefficient close to 1 indicates a
strong positive correlation, implying that the
automatic segmentation results closely align with the
manual segmentation results. Conversely, a
coefficient closer to 0 suggests a weaker correlation.
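Both agreement measures discussed above are straightforward to compute. The sketch below assumes the lesion area error is taken as the mean absolute difference between paired areas, which is one reasonable reading of the description; the area values are invented for the demonstration and are not data from the study.

```python
import math


def mean_area_error(auto_areas, manual_areas):
    """Mean absolute difference between paired lesion areas (assumed definition)."""
    diffs = [abs(a - m) for a, m in zip(auto_areas, manual_areas)]
    return sum(diffs) / len(diffs)


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical lesion areas (arbitrary units) for four scans.
manual = [10.0, 20.0, 30.0, 40.0]
automatic = [11.0, 19.0, 32.0, 41.0]

area_error = mean_area_error(automatic, manual)
correlation = pearson_r(automatic, manual)
print(area_error, correlation)
```

The two measures are complementary: a high correlation shows that the automatic areas rise and fall with the manual ones, while the area error shows how far apart they are in absolute terms, since a systematic offset would leave the correlation unchanged.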
Overall, the study's findings demonstrate the
effectiveness of the multi-scale U-Net deep network
model in accurately segmenting ischemic stroke
lesions from non-enhanced CT scans. The
incorporation of the Dice loss function addresses data
imbalance issues, while the comparison between
manual and automatic segmentation methods
provides validation of the model's performance. The
lesion area error and Pearson correlation coefficient
serve as quantitative measures to evaluate the
accuracy and correlation of the automatic
segmentation results with manual segmentation,
further validating the model's utility in clinical
practice.
2.2.2 Deep Neural Network (DNN)
Cheon et al. conducted a comparative study
evaluating the effectiveness of Deep Neural
Networks (DNN) in predicting mortality among stroke
patients against five other machine-learning methods
(Cheon, 2019).
The analysis involved 11 variables, encompassing
factors such as gender, age, type of insurance,
admission model, need for brain surgery,
geographical region, length of hospital stay, hospital
location, number of hospital beds, stroke type, and
others. With a dataset comprising 15,099 subjects
with a history of stroke, the researchers employed a
combination of DNN and scaled Principal
Component Analysis (PCA) to automatically extract
features from the data and identify relevant risk factors.
The DNN analyzed the relevant variables, with
continuous inputs enhanced through scaled principal
component analysis. Three performance metrics
were reported: sensitivity,
specificity, and Area Under the Curve (AUC).
Sensitivity is the proportion of actual positive
cases that the model correctly identifies.
Specificity is the proportion of actual negative cases
that the model correctly identifies. AUC,
the area under the Receiver Operating
Characteristic curve, summarizes the model's
discriminative ability
across all classification thresholds, with higher values
indicating better separation of positive and
negative cases.
The reported sensitivity, specificity, and
AUC were 64.32%, 85.56%, and 83.48%,
respectively. These results suggest that the
DNN-based approach, supplemented by scaled PCA,
is promising for predicting
stroke outcomes and other diseases even when faced with
limited data. The relatively high specificity indicates
a low rate of false positives, while the AUC
reflects the model's overall discriminative ability,
underscoring its potential utility in clinical settings for
identifying high-risk patients.
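The three metrics can be computed directly from labels and model scores. The sketch below is a minimal standard-library illustration; the labels, scores, and the 0.5 threshold are invented for the example and have no connection to the reported figures. AUC is computed by its rank interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels and predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(y_true, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = positive outcome) and model scores.
y_true = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

sens, spec = sensitivity_specificity(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(sens, spec, auc)
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC does not, which is why a model can show a fairly high AUC together with a modest sensitivity at one particular operating point.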
2.2.3 Generative Adversarial Networks
(GAN)
Van Voorst et al. developed and evaluated a method
based on Generative Adversarial Networks (GAN) for
segmenting infarct
and hemorrhagic stroke lesions on follow-up
Noncontrast Computed Tomography (NCCT) scans.
The study used data from three Dutch acute
ischemic stroke trials, comprising 820 patients with
baseline and follow-up NCCT scans. Using a
GAN, the researchers automated the segmentation of
lesions from follow-up scans in acute ischemic
stroke patients. The results showed moderate to
good performance in lesion segmentation, with
Dice similarity coefficients ranging
from 0.31 to 0.59 (Van Voorst, 2022). Notably, infarct
lesions observed at the 1-week follow-up exhibited
excellent volumetric correspondence. This
unsupervised approach holds promise for automated
lesion segmentation in clinical settings. NCCT scans,
used for
follow-up assessments, provide detailed images
without the need for contrast agents, making them a
valuable tool in stroke diagnosis and monitoring.
3 DISCUSSIONS
Several major challenges remain in
machine learning for brain stroke prediction. First,
interpretability remains a major obstacle, as many
machine learning models are regarded as opaque black
boxes, making it challenging to understand their
decision-making mechanisms and placing considerable
pressure on the decision-makers who rely on them. It is also
difficult to convince users and patients of the
reliability of the results. Models therefore need to be
designed for greater interpretability for decision-makers
and users. Second, privacy issues arise when
models are trained on sensitive personal data, raising concerns
about the potential exposure of users' private
information. Third, the practicality of
deploying machine learning models in real-world
situations may be hindered by factors such as
data quality issues, sparse labelling, or environmental
changes. Moreover, as models become more
complex, interpreting their predictions becomes more
challenging. Data quality and bias can significantly
affect the performance and robustness of machine
learning models, especially with
imbalanced datasets or missing data. Results are also
not always reproducible: when the scenario changes
or some labels are missing, obtaining the same
results and accurately
predicting outcomes becomes challenging.
Summarizing the necessary information and
achieving consistent results across diverse datasets
therefore remains a significant challenge.
Looking ahead, potential solutions and avenues
for progress are emerging. For example, the SHapley
Additive exPlanations (SHAP) approach
aims to facilitate clinical interpretation and intuitive
comprehension of feature significance by
visualizing the relationship
between each feature and its associated predictive
power (Lundberg, 2020). The Federated Learning
(FL) approach provides a way to train models on
distributed data sources, improving model
performance while protecting user privacy. A wide
range of FL-based architectures,
as discussed by Yaqoob et al. (Yaqoob, 2023), have been
categorised as horizontal FL and vertical FL, and
researchers have used diverse approaches to outline
the characteristics and results of
optimisation strategies implemented in FL and to
discuss the expected business consequences
of federated learning. In addition, using the principles
of transfer learning, pre-trained models can be
transferred from one domain to another domain of interest,
thereby reducing data requirements and enhancing
model generalisation. Future work could compare
and contrast the most widely
applicable machine learning methods and use a
combination of SHAP and FL to improve
interpretability and privacy in pursuit of better
solutions.
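The attribution idea behind Shapley-value methods can be illustrated by computing exact Shapley values for a very small model. The sketch below enumerates every coalition of features, which is only feasible for a handful of features; the function `shapley_values`, the feature names, and the toy risk model are all hypothetical and chosen for this example, and practical SHAP tooling relies on faster algorithms rather than this brute-force enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(features, value_fn):
    """Exact Shapley value of each feature for a set-valued function.

    value_fn(frozenset_of_features) -> float gives the model output when
    only those features are 'present'. Cost grows exponentially with the
    number of features, so this is for illustration only.
    """
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        contribution = 0.0
        for k in range(len(others) + 1):
            # Weight of coalitions of size k in the Shapley average.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                coalition = frozenset(subset)
                marginal = value_fn(coalition | {f}) - value_fn(coalition)
                contribution += weight * marginal
        phi[f] = contribution
    return phi


def toy_risk(present):
    """Hypothetical risk model with an interaction between two features."""
    risk = 0.1
    if "age" in present:
        risk += 0.3
    if "hypertension" in present:
        risk += 0.2
    if "age" in present and "hypertension" in present:
        risk += 0.1
    return risk


phi = shapley_values(["age", "hypertension", "smoking"], toy_risk)
print(phi)
```

The attributions sum to the difference between the full prediction and the baseline, and the interaction term is shared equally between the two features that produce it, while a feature the model ignores receives an attribution of zero.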
4 CONCLUSIONS
This work discusses and compares
the advantages and disadvantages of various
traditional machine learning and deep learning
methods for the prediction of stroke in patients,
with the aim of identifying the most accurate
methods, and summarises the
relatively well-developed datasets available for
experiments. The review focuses mainly on methods
such as Random Forest, Gradient Boosting, and U-Net.
Some newer
techniques have not been considered in this article,
such as large language models, time series models,
multimodal data fusion, and causal inference
methods; these will be added in future work to form a
more complete and thorough treatment.
REFERENCES
Azam, M. S., Habibullah, M., & Rana, H. K. 2020.
Performance analysis of various machine learning
approaches in stroke prediction. International Journal
of Computer Applications, 175(21), 11-15.
Benzakoun, J., Charron, S., Turc, G., Hassen, W. B.,
Legrand, L., Boulouis, G., ... & Oppenheim, C. 2021.
Tissue outcome prediction in hyperacute ischemic
stroke: Comparison of machine learning models.
Journal of Cerebral Blood Flow & Metabolism, 41(11),
3085-3096.
Cheon, S., Kim, J., & Lim, J. 2019. The use of deep learning
to predict stroke patient mortality. International
journal of environmental research and public
health, 16(11), 1876.
Cox, A. P., Raluy-Callado, M., Wang, M., Bakheit, A. M.,
Moore, A. P., & Dinet, J. 2016. Predictive analysis for
identifying potentially undiagnosed post-stroke
spasticity patients in United Kingdom. Journal of
biomedical informatics, 60, 328-333.
Dev, S., Wang, H., Nwosu, C. S., Jain, N., Veeravalli, B.,
& John, D. 2022. A predictive analytics approach for
stroke prediction using machine learning and neural
networks. Healthcare Analytics, 2, 100032.
Fernandez-Lozano, C., Hervella, P., Mato-Abad, V.,
Rodríguez-Yáñez, M., Suárez-Garaboa, S., López-
Dequidt, I., ... & Iglesias-Rey, R. 2021. Random forest-
based prediction of stroke outcome. Scientific
reports, 11(1), 10071.
Kappelhof, N., Ramos, L. A., Kappelhof, M., van Os, H. J.,
Chalos, V., van Kranendonk, K. R., ... & Marquering,
H. A. 2021. Evolutionary algorithms and decision trees
for predicting poor outcome after endovascular
treatment for acute ischemic stroke. Computers in
Biology and Medicine, 133, 104414.
Li, S., Zheng, J., & Li, D. 2021. Precise segmentation of
non-enhanced computed tomography in patients with
ischemic stroke based on multi-scale U-Net deep
network model. Computer methods and programs in
biomedicine, 208, 106278.
Liu, T., Fan, W., & Wu, C. 2019. A hybrid machine
learning approach to cerebral stroke prediction based on
imbalanced medical dataset. Artificial intelligence in
medicine, 101, 101723.
Lundberg, S. M., Erion, G., Chen, H., DeGrave, A., Prutkin,
J. M., Nair, B., ... & Lee, S. I. 2020. From local
explanations to global understanding with explainable
AI for trees. Nature machine intelligence, 2(1), 56-67.
Qiu, Y., Chang, C. S., Yan, J. L., Ko, L., & Chang, T. S.
2019. Semantic segmentation of intracranial
hemorrhages in head CT scans. In 2019 IEEE 10th
International Conference on Software Engineering and
Service Science (ICSESS) (pp. 112-115). IEEE.
Sirsat, M. S., Fermé, E., & Câmara, J. 2020. Machine
learning for brain stroke: a review. Journal of Stroke
and Cerebrovascular Diseases, 29(10), 105162.
Sun, G., Zhan, T., Owusu, B.G., Daniel, A.M., Liu, G., &
Jiang, W. 2020. Revised reinforcement learning based
on anchor graph hashing for autonomous cell activation
in cloud-RANs. Future Generation Computer Systems,
104, 60-73.
Süt, N., & Çelik, Y. 2012. Prediction of mortality in stroke
patients using multilayer perceptron neural
networks. Turkish Journal of Medical Sciences, 42(5),
886-893.
Van Voorst, H., Konduri, P. R., van Poppel, L. M., van der
Steen, W., van der Sluijs, P. M., Slot, E. M. H., ... &
Marquering, H. A. 2022. Unsupervised deep learning
for stroke lesion segmentation on follow-up CT based
on generative adversarial networks. American Journal
of Neuroradiology, 43(8), 1107-1114.
Wu, Y., Jin, Z., Shi, C., Liang, P., & Zhan, T. 2024.
Research on the Application of Deep Learning-based
BERT Model in Sentiment Analysis. arXiv preprint
arXiv:2403.08217.
Xie, Y., Jiang, B., Gong, E., Li, Y., Zhu, G., Michel, P., ...
& Zaharchuk, G. 2019. Use of gradient boosting
machine learning to predict patient outcome in acute
ischemic stroke on the basis of imaging, demographic,
and clinical information. American Journal of
Roentgenology, 212(1), 44-51.
Yaqoob, M. M., Nazir, M., Khan, M. A., Qureshi, S., & Al-
Rasheed, A. 2023. Hybrid classifier-based federated
learning in health service providers for cardiovascular
disease prediction. Applied Sciences, 13(3), 1911.