The Comprehensive Investigation of Machine Learning-Based Patient Brain Stroke Prediction

Xiuyuan Wei

Internet of Things, Concord University College Fujian Normal University, Fuzhou, Fujian, China

Keywords: Stroke Prediction, Machine Learning, Traditional Methods, Deep Learning.

Abstract: This paper aims to comprehensively review machine learning methodologies for stroke prediction, evaluating

both traditional and deep learning approaches, and discussing challenges and potential solutions in this

domain.

The paper conducts a thorough examination of machine learning methodologies for stroke prediction.

Traditional techniques are scrutinized for their efficacy in handling stroke prediction tasks across various

datasets. Deep learning approaches such as U-Net and Generative Adversarial Networks are also investigated

to assess their suitability and performance. Moreover, the review delves into the intricacies of these methods,

considering factors such as interpretability, privacy concerns, and data quality issues. Additionally, it explores

novel techniques such as SHapley Additive exPlanations (SHAP) and Federated Learning (FL) as

potential solutions to enhance interpretability and protect patient privacy. The review also examines the

potential of transfer learning to optimize model generalization across different domains, aiming to provide

insights into the most effective methodologies for stroke prediction. Findings suggest the promise of machine

learning in stroke prediction. Future research directions include integrating emerging techniques such as large

language models and multimodal data fusion for improved accuracy, guiding researchers and practitioners in

selecting appropriate machine learning methods and addressing challenges in stroke prediction for enhanced

patient care.

1 INTRODUCTION

A stroke, a neurological condition, arises from

sudden, localized damage to the central nervous

system due to vascular problems, encompassing

cerebral infarction, intracerebral hemorrhage (ICH)

(Qiu, 2019), and subarachnoid hemorrhage (SAH). It

significantly impacts global disability and mortality

rates. Each year, stroke claims the lives of roughly 4.6

million individuals, amounting to about 9 percent of

global deaths. Beyond its lethal impact, stroke also

leads to substantial nonfatal health issues and

disabilities. While research from developed nations

shows that targeted interventions at individual,

community, and national levels can significantly

reduce the occurrence of stroke and related vascular

conditions, this knowledge has not been uniformly

implemented in developing regions. The frequency of

stroke differs markedly between populations, with

risk escalating with age or unhealthy lifestyle choices

(Azam, 2020). Within the contemporary clinical

https://orcid.org/0009-0003-2068-733X

realm, the diagnostic procedure for identifying stroke

is often marked by its laborious and ineffective

nature, frequently failing to efficiently manage time

and human resources while occasionally leading to

increased rates of misdiagnosis. Consequently, there

arises an urgent imperative to explore alternative

methodologies for stroke prediction, with a specific

emphasis on harnessing the capabilities of Artificial

Intelligence (AI) models, owing to their excellent

performance in many tasks (Sun, 2020; Wu, 2024).

This transition towards AI-based stroke prediction

represents a significant departure from traditional

diagnostic approaches and holds considerable

promise in augmenting the efficiency and precision of

diagnostic protocols, ultimately culminating in

enhanced patient outcomes and optimized allocation

of healthcare resources. In a recent study, researchers

utilized retrospective data to evaluate two distinct

algorithmic approaches: an algorithm validated

through statistical methods and another trained by

clinicians. Their findings revealed that the

Wei, X.

The Comprehensive Investigation of Machine Learning-Based Patient Brain Stroke Prediction.

DOI: 10.5220/0012938200004508

Paper published under CC license (CC BY-NC-ND 4.0)

In Proceedings of the 1st International Conference on Engineering Management, Information Technology and Intelligence (EMITI 2024), pages 362-367

ISBN: 978-989-758-713-9

implementation of Random Forest demonstrated

superior prediction accuracy compared to the

clinician-trained algorithms. Specifically, the

Random Forest model exhibited heightened

sensitivity and only marginally reduced specificity,

indicating its effectiveness in predicting outcomes

(Cox, 2016). Subsequently, researchers evaluated

three distinct supervised Machine Learning (ML)

models: Random Forests (RF), Gradient Boosting,

and U-Net. Of these ML models, Gradient Boosting

emerged as notably pertinent for forecasting tissue

outcomes following acute ischemic stroke (AIS),

closely trailed by Random Forests and U-Net in terms

of effectiveness (Benzakoun, 2021). In another study,

three prominent classification methods - Neural

Network (NN), Decision Tree (DT), and Random

Forest (RF) - were compared to predict stroke

occurrence based on patient attributes. All three

machine learning models were trained on a balanced

dataset comprising 28,524 patient records. The

analysis revealed that the features were not strongly

correlated, and it was observed that a combination of

merely four features could significantly contribute to

accurate stroke prediction (Dev, 2022). CT scans

serve as a commonly utilized data source in the context of

stroke research and diagnosis (Sirsat, 2020).

Numerous research papers have adopted a hybrid

machine learning strategy to forecast stroke

occurrence using incomplete and unbalanced medical

datasets, recognizing the diverse implications of

employing different datasets in ML analysis (Liu,

2019).

This paper aims to identify the datasets and machine

learning methods currently best suited to stroke

prediction by examining and discussing machine

learning studies conducted on different datasets. The

paper proceeds as follows. Section 2 examines the

various machine learning methods employed in the

literature to analyze and test diverse datasets.

Section 3 discusses the insights

derived from these comparative analyses,

focusing on identifying and addressing dataset

challenges to enhance the efficacy of machine

learning analyses. Finally, Section 4 provides a

summary of the paper and draws conclusions based on

the discussions presented in earlier sections.

2 METHOD

The framework of AI-based algorithms in stroke

prediction typically integrates both traditional

machine learning and deep learning approaches. It

initiates with comprehensive data collection,

gathering pertinent medical information including

demographic details, medical histories, and

diagnostic test results. Subsequently, the collected

data undergoes meticulous preprocessing to ensure its

quality and consistency, facilitating the subsequent

analysis. Leveraging a hybrid approach, the model-

building phase ensues, where both traditional

machine learning algorithms and deep learning

architectures are utilized for constructing predictive

models. This entails selecting and extracting relevant

features from the preprocessed data, a crucial step in

enhancing model performance. Following model

construction, rigorous training ensues to optimize the

models' predictive capabilities, leveraging both

machine learning and deep learning techniques to

capture complex patterns within the data.

Subsequently, the models undergo thorough testing

using established evaluation metrics to assess their

effectiveness in stroke prediction tasks. Upon

successful validation, the models are ready for

deployment, after which continuous

refinement and improvement remain essential.

2.1 Traditional Machine Learning

2.1.1 Random Forest

Fernandez-Lozano et al. focus on Random Forest-based

stroke outcome prediction (Fernandez-Lozano,

2021), a choice motivated by a literature review

highlighting the superior performance of Random

Forests in biomedical applications. Data were

collected from 6022 patients, categorized into

Ischemic Stroke (IS), Intracerebral Hemorrhage

(ICH), and IS+ICH groups. After excluding certain

patients, the final data set was prepared. The model

underwent training using ten-fold cross-validation

along with 100 repeated randomizations to ensure

robustness and reliability. Analytical forecasts were

generated concerning both mortality and morbidity

among patient groups diagnosed with ischemic stroke

(IS), intracerebral hemorrhage, or a combination of

both (IS+ICH). The aim was to determine the primary

predictors used by machine learning models, particularly

Random Forest, when generating these predictions.
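
The resampling scheme described above (ten-fold cross-validation repeated over many randomizations) can be sketched in plain Python as follows. This is a minimal illustration, not the authors' code: the function name `repeated_stratified_kfold`, the toy labels, and the use of stratification by class are assumptions made for the example. A classifier such as a Random Forest would be fitted on each training split and scored on the matching test split.

```python
import random
from collections import defaultdict


def repeated_stratified_kfold(labels, n_splits=10, n_repeats=100, seed=0):
    """Yield (repeat, fold, train_idx, test_idx) for repeated stratified k-fold CV.

    Stratification is an assumption of this sketch; the reviewed study only
    states that ten-fold cross-validation was repeated 100 times.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    for repeat in range(n_repeats):
        folds = [[] for _ in range(n_splits)]
        for indices in by_class.values():
            shuffled = indices[:]
            rng.shuffle(shuffled)  # a fresh randomization for every repeat
            # Deal each class round-robin so every fold keeps the class ratio.
            for pos, idx in enumerate(shuffled):
                folds[pos % n_splits].append(idx)
        for fold, test_idx in enumerate(folds):
            test_set = set(test_idx)
            train_idx = [i for i in range(len(labels)) if i not in test_set]
            yield repeat, fold, train_idx, sorted(test_idx)


# Hypothetical toy labels: 1 = poor outcome, 0 = good outcome.
labels = [1] * 20 + [0] * 80
splits = list(repeated_stratified_kfold(labels, n_splits=10, n_repeats=3))
print(len(splits), "train/test splits generated")
```

Averaging the per-split scores over all repeats gives a more stable performance estimate than a single ten-fold run, which is the usual motivation for repeating the randomization.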

2.1.2 Gradient Boosting

Xie et al. utilized an extreme gradient boosting

model to examine 512 patients, aiming to forecast the

modified Rankin Scale (mRS) score at 90 days based on

biomarkers accessible upon admission and within 24

hours. The method employed a greedy algorithm for

feature selection and assessed model performance

through five-fold cross-validation. The results

indicate that decision tree-based gradient boosting

models exhibit high Area Under the Curve (AUC) in

predicting stroke patient recovery outcomes upon

admission. Additionally, stratifying patient groups

based on recanalization status may offer insights

beneficial for treatment decision-making processes

(Xie, 2019).
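
Greedy feature selection of the kind mentioned above can be sketched as a forward search that adds, at each step, the feature giving the largest gain in a validation score. The sketch below is only illustrative: `greedy_forward_selection`, `toy_cv_auc`, the feature names, and the gain values are all hypothetical, and in practice the scoring function would be a cross-validated AUC of a gradient boosting model rather than a lookup table.

```python
def greedy_forward_selection(candidates, score_fn, max_features=None, min_gain=1e-9):
    """Add one feature at a time, always taking the one that improves the score most."""
    selected = []
    best_score = score_fn(selected)
    remaining = list(candidates)
    while remaining and (max_features is None or len(selected) < max_features):
        # Score every one-feature extension of the current subset.
        scored = [(score_fn(selected + [f]), f) for f in remaining]
        top_score, top_feature = max(scored, key=lambda pair: pair[0])
        if top_score - best_score <= min_gain:
            break  # no remaining feature helps, so stop
        selected.append(top_feature)
        remaining.remove(top_feature)
        best_score = top_score
    return selected, best_score


def toy_cv_auc(subset):
    """Hypothetical stand-in for a five-fold cross-validated AUC."""
    gains = {"age": 0.15, "nihss": 0.20, "glucose": 0.05, "noise": 0.0}
    # A small penalty per feature mimics the variance cost of extra inputs.
    return 0.5 + sum(gains[f] for f in subset) - 0.01 * len(subset)


selected, score = greedy_forward_selection(
    ["age", "nihss", "glucose", "noise"], toy_cv_auc
)
print(selected, round(score, 2))
```

The search is greedy because it never revisits an earlier choice; it is cheap to run but is not guaranteed to find the best subset overall.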

2.1.3 Decision Trees

Kappelhof, et al. introduce a novel algorithm that

employs an evolutionary approach to develop

decision trees that are both interpretable and powerful

for predicting adverse outcomes following

endovascular therapy for acute ischemic stroke

(Kappelhof, 2021). Utilizing 5-fold cross-validation,

the training cohort comprised an average of 1090

patients, while the validation cohort encompassed

273 patients, achieving an average accuracy rate of

72%. In this decision tree, decision nodes contain

split-ranges rather than split-values; these ranges are

mathematically defined using the notion of

belongingness and piecewise linear membership

functions. The algorithm's primary aim is to keep

the tree small enough to remain

interpretable while optimizing prediction accuracy

on unseen data. The algorithm was further

improved by integrating these membership functions into the

operation of the evolutionary algorithm. Initially,

decision trees were generated using the grow method

as part of the initialization process. Following this,

the selection of individuals to advance to the

crossover phase and create the subsequent generation

took place. The common one-point crossover method

was utilized during this phase. Furthermore, the

mutation phase was implemented to introduce

variability, with incorrect trees undergoing pruning.

Additionally, imputation and experimental setup

were integrated into the process. Finally, on average,

the fuzzy algorithm converged to its final solution

within the initial hour of execution.
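
To make the idea of a split-range concrete, the sketch below shows one plausible piecewise linear membership function and how a fuzzy decision node might blend the outputs of its two branches. The exact formulation used by Kappelhof et al. is not given in this review, so the function names, the blending rule, and the numbers are assumptions for illustration only.

```python
def right_branch_membership(value, low, high):
    """Degree (0 to 1) to which `value` belongs to the right branch of a split range."""
    if low >= high:
        raise ValueError("low must be smaller than high")
    if value <= low:
        return 0.0  # fully in the left branch
    if value >= high:
        return 1.0  # fully in the right branch
    # Inside the range, membership rises linearly from 0 to 1.
    return (value - low) / (high - low)


def fuzzy_node_prediction(value, low, high, left_leaf, right_leaf):
    """Blend two leaf probabilities by branch membership (assumed combination rule)."""
    mu = right_branch_membership(value, low, high)
    return (1.0 - mu) * left_leaf + mu * right_leaf


# Hypothetical node: age split range [60, 80], leaf risks 0.2 (left) and 0.7 (right).
risk_at_70 = fuzzy_node_prediction(70, 60, 80, left_leaf=0.2, right_leaf=0.7)
print(round(risk_at_70, 2))
```

A crisp split-value is the limiting case in which the range shrinks to a single point, so a patient just below and just above the threshold would otherwise receive very different predictions.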

2.1.4 Neural Networks

Süt et al. conducted a comprehensive study utilizing

Multilayer Perceptron (MLP) neural networks to

predict mortality in stroke patients, incorporating a

dataset of 584 individuals and examining various

prognostic factors. Six distinct MLP training algorithms were

employed: Quick Propagation (QP), Levenberg-

Marquardt (LM), Backpropagation (BP), Quasi-

Newton (QN), Delta Bar Delta (DBD), and Conjugate

Gradient Descent (CGD) (Süt, 2012). The QP

algorithm, despite its potential instability, showcased

remarkable efficiency in weight adjustment

computation, yielding the highest performance

metrics including specificity, sensitivity, accuracy,

and area under the curve (AUC). LM, utilizing a least

squares estimation method, showed reasonable

performance but fell short of QP in predictive

accuracy. BP, a widely used technique, demonstrated

inferior performance compared to QP despite its

simplicity. QN, an advanced training method, did not

surpass QP in predictive accuracy despite

approximating the inverse Hessian matrix for error

gradient calculation. DBD, an alternative to BP,

exhibited promising results but did not outperform

QP. CGD, employing iterative error gradient and

search direction calculations, displayed the lowest

predictive accuracy. Overall, the study underscores

the pivotal role of algorithm selection in MLP

modelling and highlights the potential efficacy of QP-

trained models in clinical mortality prediction.

2.2 Deep Learning

2.2.1 U-Net

Li et al. conducted a study aimed at improving care

for patients with ischemic stroke by utilizing a

sophisticated multi-scale U-Net deep network model

(Li, 2021). This model was employed to segment

image features extracted from non-enhanced

computed tomography (CT) scans of 30 stroke

patients. To address the challenge of data imbalance

during model training, the authors incorporated the

Dice loss function, an overlap-based loss commonly used in

medical image segmentation tasks. This function

helps in optimizing the model's performance by

penalizing false positives and false negatives, thereby

ensuring more accurate segmentation results.
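
A minimal sketch of a soft Dice loss is given below to show why it copes with class imbalance. It operates on flattened lists of pixel values; the function name, the smoothing constant `eps`, and the toy mask are assumptions of this example rather than details taken from Li et al.

```python
def soft_dice_loss(pred, target, eps=1e-6):
    """Return 1 - Dice overlap between predicted probabilities and a binary mask."""
    intersection = sum(p * t for p, t in zip(pred, target))
    denominator = sum(pred) + sum(target)
    dice = (2.0 * intersection + eps) / (denominator + eps)
    return 1.0 - dice


# Hypothetical 100-pixel image in which the lesion covers only 4 pixels.
target = [0.0] * 96 + [1.0] * 4
all_background = [0.0] * 100  # pixel accuracy is 96%, yet the lesion is missed
perfect = list(target)

loss_background = soft_dice_loss(all_background, target)
loss_perfect = soft_dice_loss(perfect, target)
print(round(loss_background, 3), round(loss_perfect, 3))
```

Because the loss depends only on the overlap with the lesion, a model that predicts background everywhere is penalized heavily even though its pixel-wise accuracy is high.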

The study involved two primary methods: manual

segmentation and automatic segmentation. In manual

segmentation, trained radiologists manually

delineated the ischemic stroke lesions on the CT

scans. On the other hand, automatic segmentation

utilized the multi-scale U-Net deep network model to

segment the lesions automatically. The comparison

between these two methods revealed that the

automatic segmentation closely approximated the

manual segmentation, indicating the effectiveness of

the proposed model in accurately identifying

ischemic stroke lesions.

The "lesion area error" refers to the difference

between the segmented lesion areas obtained from

automatic segmentation and manual segmentation.

This metric provides insight into the accuracy of the

automatic segmentation method compared to the gold

standard manual segmentation. A lower lesion area

error indicates higher accuracy in lesion delineation

by the automatic segmentation method.

The Pearson correlation coefficient is a statistical

measure used to assess the linear relationship between

two variables. In this context, it quantifies the degree

of correlation between the lesion areas obtained from

automatic segmentation and manual segmentation. A

Pearson correlation coefficient close to 1 indicates a

strong positive correlation, implying that the

automatic segmentation results closely align with the

manual segmentation results. Conversely, a

coefficient closer to 0 suggests a weaker correlation.
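
Both agreement measures can be computed from paired lesion areas in a few lines. The sketch below is illustrative: the area values are invented, and treating the lesion area error as a mean absolute difference is an assumption, since the review does not state whether the original study used a signed, absolute, or relative error.

```python
import math


def mean_abs_area_error(auto_areas, manual_areas):
    """Average absolute difference between automatic and manual lesion areas."""
    diffs = [abs(a - m) for a, m in zip(auto_areas, manual_areas)]
    return sum(diffs) / len(diffs)


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sd_x * sd_y)


# Hypothetical lesion areas (cm^2) for four patients.
manual = [10.0, 20.0, 30.0, 40.0]
auto = [11.0, 19.0, 32.0, 41.0]

area_error = mean_abs_area_error(auto, manual)
r = pearson_r(auto, manual)
print(area_error, round(r, 3))
```

The two measures are complementary: a high correlation shows that the areas rise and fall together, but only the area error reveals a systematic over- or under-estimation.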

Overall, the study's findings demonstrate the

effectiveness of the multi-scale U-Net deep network

model in accurately segmenting ischemic stroke

lesions from non-enhanced CT scans. The

incorporation of the Dice loss function addresses data

imbalance issues, while the comparison between

manual and automatic segmentation methods

provides validation of the model's performance. The

lesion area error and Pearson correlation coefficient

serve as quantitative measures to evaluate the

accuracy and correlation of the automatic

segmentation results with manual segmentation,

further validating the model's utility in clinical

practice.

2.2.2 Deep Neural Network (DNN)

Cheon et al. conducted a comparative study

evaluating the effectiveness of Deep Neural

Networks (DNN) in predicting stroke risk factors

compared to five other machine-learning methods.

The analysis involved 11 variables, encompassing

factors such as gender, age, type of insurance,

admission model, need for brain surgery,

geographical region, length of hospital stays, hospital

location, number of hospital beds, stroke type, and

others. With a dataset comprising 15,099 subjects

with a history of stroke, the researchers employed a

combination of DNN and scaled Principal

Component Analysis (PCA) to automatically extract

features from the data and identify stroke risk factors.

The primary methodology employed deep neural

networks to analyze the relevant variables, enhancing

continuous inputs through scaled principal

component analysis. This approach was evaluated

using three key performance metrics: sensitivity,

specificity, and Area Under the Curve (AUC).

Sensitivity represents the proportion of true positive

cases correctly identified by the model, indicating its

ability to detect stroke cases accurately. Specificity

measures the proportion of true negative cases

correctly identified by the model, highlighting its

capacity to correctly identify non-stroke cases. AUC,

or the Area Under the Receiver Operating

Characteristic Curve, provides a comprehensive

assessment of the model's discriminative ability

across various thresholds, with higher values

indicating better overall performance in

distinguishing between stroke and non-stroke cases.

The reported sensitivity, specificity, and

AUC were 64.32%, 85.56%, and 83.48%, respectively (Cheon,

2019). These results suggest that the

DNN-based approach, supplemented by scaled PCA,

demonstrates promising potential for predicting

stroke and other diseases even when faced with

limited data. The relatively high specificity indicates

a low rate of false positives, while the AUC value

reflects the model's overall discriminative ability,

underscoring its utility in clinical settings for

identifying individuals at risk of stroke.
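
The three metrics defined above can be computed directly from labels and model scores. The following sketch uses invented labels and scores and a hypothetical decision threshold of 0.5; the AUC is computed through its rank interpretation, that is, the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels and binary predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(y_true, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = positive case) and model scores.
y_true = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # threshold of 0.5 is an assumption

sens, spec = sensitivity_specificity(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(round(sens, 3), spec, auc)
```

Sensitivity and specificity change with the chosen threshold, whereas the AUC summarizes ranking quality across all thresholds, which is why studies typically report all three.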

2.2.3 Generative Adversarial Networks (GAN)

Van Voorst et al. developed a method based on Generative

Adversarial Networks (GAN) and evaluated

its effectiveness in segmenting infarct

and hemorrhagic stroke lesions on follow-up

Noncontrast Computed Tomography (NCCT) scans.

The paper utilized data from three Dutch acute

ischemic stroke trials, comprising 820 patients with

baseline and follow-up NCCT scans. Employing a

GAN, the researchers automated the segmentation of

infarct lesions from follow-up scans in acute ischemic

stroke patients. The results showcased moderate to

good performance in lesion segmentation, as

evidenced by Dice similarity coefficients ranging

from 0.31 to 0.59 (Van Voorst, 2022). Notably, infarct

lesions observed at the 1-week follow-up exhibited

excellent volumetric correspondence. This

unsupervised approach holds promise for automated

lesion segmentation in clinical settings. Noncontrast

Computed Tomography (NCCT) scans, utilized for

follow-up assessments, provide detailed images

without the need for contrast agents, making them a

valuable tool in stroke diagnosis and monitoring.

3 DISCUSSIONS

Several major challenges are on the horizon in

machine learning for brain stroke prediction. First,

interpretability remains a major obstacle, as many

machine learning models are seen as esoteric black

boxes, making it challenging to understand their

decision-making mechanisms and placing a significant

burden on decision-makers. It is also

very difficult to convince users and patients of the

reliability of the results. Therefore, models must be

designed for greater interpretability for decision-makers

and users. Second, privacy issues arise when training

models use personal sensitive data, raising concerns

about the potential exposure of users' private

information. In addition, the practicality of

implementing machine learning models in real-world

situations may be hindered by various factors such as

data quality issues, sparse labelling, or environmental

changes. Moreover, as models become more

complex, interpreting their predictions becomes more

challenging. Data quality and bias can significantly

affect the performance and robustness of machine

learning models, especially when faced with

unbalanced datasets or missing data. Moreover, results

are not always reproducible across

settings: when the scenario changes or some

labels are missing, obtaining consistent results and accurately

predicting outcomes becomes challenging.

Summarizing the necessary information and

achieving uniform results for diverse datasets

presents a significant challenge.

Looking ahead, potential solutions and avenues

for progress are emerging. For example, the SHapley

Additive exPlanations (SHAP) approach

aims to

facilitate clinical interpretation and intuitive

comprehension of feature significance. It

accomplishes this by visualizing the relationship

between each feature and its contribution to the

model's prediction (Lundberg, 2020). The Federated Learning

(FL) approach provides a way to train models on

distributed data sources, improving model

performance while protecting user privacy. A wide

range of architectures based on Federated Learning,

as discussed in (Yaqoob, 2023), have been

categorised as horizontal FL and vertical FL, and

several studies have outlined

the characteristics and results of the

optimisation strategies implemented by FL and

discussed some of the expected business consequences

of federated learning. In addition, using the principles

of transfer learning, pre-trained models can be

migrated from one domain to another of interest,

thereby reducing data requirements and enhancing

model generalisation. Future work could comprehensively

compare and contrast several of the most widely

applicable machine learning methods, using a

combination of SHAP and FL to improve

interpretability and privacy and achieve optimal

solutions.
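
The two ideas discussed above can be illustrated with a short sketch. The first function computes exact Shapley values by averaging each feature's marginal contribution over all feature orderings; this brute-force enumeration is only feasible for a handful of features and is not the algorithm used inside the SHAP library, which relies on faster model-specific methods. The second function shows weighted parameter averaging, a common aggregation rule in federated learning, which is assumed here and not necessarily the scheme used in (Yaqoob, 2023). The risk model, feature names, weights, and sample counts are all hypothetical.

```python
from itertools import permutations


def exact_shapley(features, value_fn):
    """Average marginal contribution of each feature over all orderings."""
    totals = {f: 0.0 for f in features}
    orderings = list(permutations(features))
    for order in orderings:
        present = []
        previous = value_fn(frozenset(present))
        for f in order:
            present.append(f)
            current = value_fn(frozenset(present))
            totals[f] += current - previous
            previous = current
    return {f: total / len(orderings) for f, total in totals.items()}


def toy_risk(present):
    """Hypothetical risk score with an age x hypertension interaction."""
    score = 0.0
    if "age" in present:
        score += 0.3
    if "hypertension" in present:
        score += 0.2
    if "age" in present and "hypertension" in present:
        score += 0.1  # interaction term, shared equally by both features
    return score


def federated_average(client_weights, client_sizes):
    """Average model parameters across clients, weighted by local sample count."""
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]


phi = exact_shapley(["age", "hypertension", "bmi"], toy_risk)
# Two hypothetical hospitals share parameters only, never patient records.
global_weights = federated_average([[1.0, 2.0], [3.0, 4.0]], [100, 300])
print(phi, global_weights)
```

The Shapley values sum to the difference between the full model output and the empty baseline, which is what makes them attractive for clinical explanation, while the averaging step shows how a shared model can be updated without raw data leaving each site.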

4 CONCLUSIONS

This work discusses and compares

the advantages and disadvantages of various

traditional machine learning and deep learning methods for the

prediction of stroke in patients, highlighting the methods

reported to achieve the highest accuracy and summarising the

relatively well-developed datasets available for

experiments. The review mainly covers methods such as

RF, GB, and U-Net and synthesizes their

stroke prediction results. Some newer

techniques have not been considered in this article,

such as large language models, time series models,

multimodal data fusion, and causal inference

methods; these will be addressed in future work to form a

more complete and thorough

review.

REFERENCES

Azam, M. S., Habibullah, M., & Rana, H. K. 2020.

Performance analysis of various machine learning

approaches in stroke prediction. International Journal

of Computer Applications, 175(21), 11-15.

Benzakoun, J., Charron, S., Turc, G., Hassen, W. B.,

Legrand, L., Boulouis, G., ... & Oppenheim, C. 2021.

Tissue outcome prediction in hyperacute ischemic

stroke: Comparison of machine learning models.

Journal of Cerebral Blood Flow & Metabolism, 41(11),

3085-3096.

Cheon, S., Kim, J., & Lim, J. 2019. The use of deep learning

to predict stroke patient mortality. International

journal of environmental research and public

health, 16(11), 1876.

Cox, A. P., Raluy-Callado, M., Wang, M., Bakheit, A. M.,

Moore, A. P., & Dinet, J. 2016. Predictive analysis for

identifying potentially undiagnosed post-stroke

spasticity patients in United Kingdom. Journal of

biomedical informatics, 60, 328-333.

Dev, S., Wang, H., Nwosu, C. S., Jain, N., Veeravalli, B.,

& John, D. 2022. A predictive analytics approach for

stroke prediction using machine learning and neural

networks. Healthcare Analytics, 2, 100032.

Fernandez-Lozano, C., Hervella, P., Mato-Abad, V.,

Rodríguez-Yáñez, M., Suárez-Garaboa, S., López-

Dequidt, I., ... & Iglesias-Rey, R. 2021. Random forest-

based prediction of stroke outcome. Scientific

reports, 11(1), 10071.

Kappelhof, N., Ramos, L. A., Kappelhof, M., van Os, H. J.,

Chalos, V., van Kranendonk, K. R., ... & Marquering,

H. A. 2021. Evolutionary algorithms and decision trees

for predicting poor outcome after endovascular

treatment for acute ischemic stroke. Computers in

Biology and Medicine, 133, 104414.

Li, S., Zheng, J., & Li, D. 2021. Precise segmentation of

non-enhanced computed tomography in patients with

ischemic stroke based on multi-scale U-Net deep

network model. Computer methods and programs in

biomedicine, 208, 106278.

Liu, T., Fan, W., & Wu, C. 2019. A hybrid machine

learning approach to cerebral stroke prediction based on

imbalanced medical dataset. Artificial intelligence in

medicine, 101, 101723.

Lundberg, S. M., Erion, G., Chen, H., DeGrave, A., Prutkin,

J. M., Nair, B., ... & Lee, S. I. 2020. From local

explanations to global understanding with explainable

AI for trees. Nature machine intelligence, 2(1), 56-67.

Qiu, Y., Chang, C. S., Yan, J. L., Ko, L., & Chang, T. S.

2019. Semantic segmentation of intracranial

hemorrhages in head CT scans. In 2019 IEEE 10th

International Conference on Software Engineering and

Service Science (ICSESS) (pp. 112-115). IEEE.

Sirsat, M. S., Fermé, E., & Câmara, J. 2020. Machine

learning for brain stroke: a review. Journal of Stroke

and Cerebrovascular Diseases, 29(10), 105162.

Sun, G., Zhan, T., Owusu, B.G., Daniel, A.M., Liu, G., &

Jiang, W. 2020. Revised reinforcement learning based

on anchor graph hashing for autonomous cell activation

in cloud-RANs. Future Generation Computer Systems,

104, 60-73.

Süt, N., & Çelik, Y. 2012. Prediction of mortality in stroke

patients using multilayer perceptron neural

networks. Turkish Journal of Medical Sciences, 42(5),

886-893.

Van Voorst, H., Konduri, P. R., van Poppel, L. M., van der

Steen, W., van der Sluijs, P. M., Slot, E. M. H., ... &

Marquering, H. A. 2022. Unsupervised deep learning

for stroke lesion segmentation on follow-up CT based

on generative adversarial networks. American Journal

of Neuroradiology, 43(8), 1107-1114.

Wu, Y., Jin, Z., Shi, C., Liang, P., & Zhan, T. 2024.

Research on the Application of Deep Learning-based

BERT Model in Sentiment Analysis. arXiv preprint

arXiv:2403.08217.

Xie, Y., Jiang, B., Gong, E., Li, Y., Zhu, G., Michel, P., ...

& Zaharchuk, G. 2019. Use of gradient boosting

machine learning to predict patient outcome in acute

ischemic stroke on the basis of imaging, demographic,

and clinical information. American Journal of

Roentgenology, 212(1), 44-51.

Yaqoob, M. M., Nazir, M., Khan, M. A., Qureshi, S., & Al-

Rasheed, A. 2023. Hybrid classifier-based federated

learning in health service providers for cardiovascular

disease prediction. Applied Sciences, 13(3), 1911.
