
with precision, which can help clinicians make more
informed decisions. AI aids in diagnosing and pre-
dicting the progression of stroke, improving treatment
response predictions, and supporting early interven-
tions that are crucial for stroke recovery and preven-
tion.
AI-driven predictive models have been designed
to learn from stroke data to forecast outcomes such
as mortality, functional impairment, and recovery po-
tential. ML models like support vector machines, ran-
dom forests, and neural networks have been employed
to predict key outcomes using structured clinical data.
These models not only provide personalized prog-
noses but also have the potential to improve patient
care by identifying high-risk individuals early. How-
ever, challenges remain in integrating these models
into clinical practice due to issues like small datasets
and poor reporting standards in existing studies.
For AI to become a trustworthy resource in stroke
care, transparency, reproducibility, and traceability
are essential. There is a growing demand for the
reproducibility of AI-based research, which is nec-
essary to ensure that models can be independently
validated and applied to different patient populations.
In this work, we take a first step towards
providing such trustworthy resources for brain stroke
data.
Data and Code Availability: To ensure repro-
ducibility, we have made both the data and the code
used in our experiments publicly available at:
https://github.com/DimitarTrajkov/
DataModel-Analyzer.
2 DATA AND METHOD
DESCRIPTION
In our study, we collected a total of eight publicly
available tabular datasets related to brain stroke:
four regression datasets and four classification
datasets. Of the classification datasets, two pose
binary classification problems, and two pose
multi-class classification problems. Five of the
datasets were obtained from the Data.World reposi-
tory, and three from Kaggle. Table 1 provides an overview of the
datasets used in this study. It includes the names of
the datasets, the number of instances, the number of
features, and specifies whether each dataset is used
for a classification (C) or regression (R) task.
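To illustrate how such an overview can be assembled, the following sketch tabulates instance counts, feature counts, and task type with pandas. The dataset names, columns, and synthetic values here are hypothetical stand-ins, not the actual stroke datasets.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Hypothetical stand-ins for two of the tabular stroke datasets;
# the last column of each frame is assumed to be the target variable.
datasets = {
    "stroke_binary": (pd.DataFrame({
        "age": rng.integers(30, 90, 100),
        "nihss": rng.integers(0, 42, 100),
        "stroke": rng.integers(0, 2, 100),
    }), "C"),
    "stroke_recovery": (pd.DataFrame({
        "age": rng.integers(30, 90, 80),
        "mrs_baseline": rng.integers(0, 6, 80),
        "recovery_score": rng.random(80),
    }), "R"),
}

rows = []
for name, (df, task) in datasets.items():
    rows.append({
        "dataset": name,
        "instances": len(df),
        "features": df.shape[1] - 1,  # exclude the target column
        "task": task,                 # C = classification, R = regression
    })

overview = pd.DataFrame(rows)
print(overview)
```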
We evaluated the performance of a broad spec-
trum of models implemented in the scikit-learn
toolbox (Pedregosa et al., 2011) to explore differ-
ent approaches to prediction and analysis. For the
classification datasets, we used several families of
methods. First, we applied ensemble meth-
ods, such as AdaBoostClassifier, BaggingClassi-
fier, RandomForestClassifier, GradientBoosting-
Classifier, XGBClassifier (from the XGBoost li-
brary), and LightGBMClassifier, for their ability
to improve predictive accuracy by combining mul-
tiple weak learners. These models are particularly
effective in capturing complex, non-linear relation-
ships in the data. We also incorporated linear models
like LogisticRegression, which are valued for their
interpretability and simplicity. Other classifiers in-
cluded DecisionTreeClassifier, KNeighborsClassi-
fier, MLPClassifier, QuadraticDiscriminantAnal-
ysis, RadiusNeighborsClassifier, SGDClassifier,
and SupportVectorClassifier (SVC), each contribut-
ing unique strengths to the classification tasks.
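As an illustration (not the paper's actual experimental code), a subset of the classifiers named above can be compared with cross-validation on synthetic tabular data; the data, hyperparameters, and fold count are assumptions for the sketch.

```python
# Hedged sketch: cross-validated accuracy for several scikit-learn
# classifiers on a synthetic stand-in for a stroke classification dataset.
from sklearn.datasets import make_classification
from sklearn.ensemble import (AdaBoostClassifier, BaggingClassifier,
                              GradientBoostingClassifier,
                              RandomForestClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic tabular data (illustrative only).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

models = {
    "AdaBoost": AdaBoostClassifier(random_state=0),
    "Bagging": BaggingClassifier(random_state=0),
    "RandomForest": RandomForestClassifier(random_state=0),
    "GradientBoosting": GradientBoostingClassifier(random_state=0),
    "LogisticRegression": LogisticRegression(max_iter=1000),
    "DecisionTree": DecisionTreeClassifier(random_state=0),
    "KNeighbors": KNeighborsClassifier(),
    "SVC": SVC(),
}

# Mean 5-fold cross-validation accuracy per model.
scores = {name: cross_val_score(m, X, y, cv=5).mean()
          for name, m in models.items()}
for name, acc in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name:20s} {acc:.3f}")
```

The same loop extends directly to the XGBoost and LightGBM classifiers when those libraries are installed.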
For the regression datasets, we also evalu-
ated a variety of models. As for the
classification datasets, we used different ensem-
ble methods such as AdaBoostRegressor, Baggin-
gRegressor, RandomForestRegressor, Gradient-
BoostingRegressor, HistGradientBoostingRegres-
sor, LightGBMRegressor, and XGBoostRegres-
sor (from the XGBoost library). Linear mod-
els, including LinearRegression, RidgeRegression,
LassoRegression, LassoLars, ElasticNetRegres-
sion, BayesianRidgeRegression, TheilSenRegres-
sor, HuberRegressor, RANSACRegressor, Pas-
siveAggressiveRegressor, SGDRegressor, Least-
AngleRegression, and OrthogonalMatchingPur-
suit, were employed for their simplicity and effec-
tiveness in datasets with linear relationships. Ad-
ditionally, GaussianProcessRegressor and KNeigh-
borsRegressor were included to capture local data
structures and model complex relationships, while
MLPRegressor was used for its deep learning capa-
bilities. Finally, we explored the performance of
some specialized regressors, such as OrdinalRegression
(from the mord library) and TweedieRegressor.
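Analogously to the classification case, a minimal sketch (again with synthetic data and assumed settings, not the paper's actual setup) compares several of the listed linear and ensemble regressors via cross-validated R²:

```python
# Hedged sketch: mean 5-fold cross-validated R^2 for a subset of the
# scikit-learn regressors named above, on synthetic tabular data.
from sklearn.datasets import make_regression
from sklearn.ensemble import (GradientBoostingRegressor,
                              RandomForestRegressor)
from sklearn.linear_model import (BayesianRidge, ElasticNet,
                                  HuberRegressor, Lasso,
                                  LinearRegression, RANSACRegressor,
                                  Ridge)
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsRegressor

# Synthetic regression data (illustrative only).
X, y = make_regression(n_samples=300, n_features=10, noise=5.0,
                       random_state=0)

models = {
    "LinearRegression": LinearRegression(),
    "Ridge": Ridge(),
    "Lasso": Lasso(),
    "ElasticNet": ElasticNet(),
    "BayesianRidge": BayesianRidge(),
    "Huber": HuberRegressor(max_iter=1000),
    "RANSAC": RANSACRegressor(random_state=0),
    "KNeighbors": KNeighborsRegressor(),
    "RandomForest": RandomForestRegressor(random_state=0),
    "GradientBoosting": GradientBoostingRegressor(random_state=0),
}

r2 = {name: cross_val_score(m, X, y, cv=5, scoring="r2").mean()
      for name, m in models.items()}
for name, score in sorted(r2.items(), key=lambda kv: -kv[1]):
    print(f"{name:18s} {score:.3f}")
```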
3 DESIGN OF THE
EXPERIMENTAL STUDY
Figure 1 illustrates the design of the executed exper-
imental study. After identifying the relevant
datasets and categorizing them as regression or
classification tasks based on the target variable,
we manually examined each dataset to identify
those that required manual preprocessing. The pre-
processing steps included several standard procedures
applied across all datasets: removal of features with
constant values for all examples or missing values for
HEALTHINF 2025 - 18th International Conference on Health Informatics