Metaheuristics Applied to Optimal Feature Selection for Accurate

Predictive Models in Smart Health: A Case Study on Hypotension

Prediction in Hemodialysis Patients

María Santamera-Lastras

1,2 a

, Felipe Cisternas-Caneo

3 b

, José Carlos Barrera-García

3 c

Broderick Crawford

3 d

, Alberto Garcés-Jiménez

2,4,5 e

, Diego Rodríguez-Puyol

1,5 f

and José Manuel Gómez-Pulido

2,4,5 g

Dept. of Medicine and Medical Specialties, Universidad de Alcalá, Madrid, Spain

Health Computing and Intelligent Systems Research Group (HCIS), Universidad de Alcalá, Madrid, Spain

Escuela de Ingeniería Informática, Pontiﬁcia Universidad Católica de Valparaíso, Valparaíso, Chile

Dept. of Computer Science, Universidad de Alcalá, Madrid, Spain

Ramón y Cajal Institute for Health Research (IRYCIS), Madrid, Spain

Keywords:

Intradialytic Hypotension, Metaheuristic, Predictive Learning, Dimensionality Reduction, Selection of

Characteristics.

Abstract:

Predicting potential hypotensive episodes in chronic kidney disease patients before dialysis is crucial for pre-

venting complications and ensuring effective treatment. This study explores the use of metaheuristic algo-

rithms to optimize the complex task of selecting the feature set needed to develop a highly accurate predictive

machine learning model for detecting hypotension, based on clinical parameters from the dialyzer and ana-

lytical data from blood tests. Metaheuristic algorithms offer a robust approach to optimal variable selection

and subsequent dimensionality reduction, leading to more accurate machine learning predictor models. In

this context, two relevant metaheuristic algorithms were employed: Particle Swarm Optimization (PSO) and

Grey Wolf Optimizer (GWO), along with the supervised machine learning algorithm XGBoost. The results

demonstrate that the application of metaheuristic techniques not only reduces the feature count from 67 to 36

variables but also improves classiﬁer performance, thereby enhancing the prediction of hypotensive events.

Speciﬁcally, the optimized model achieved an Area Under the Curve (AUC) of 0.76 and a recall of 0.764 for

the minority class (hypotensive episodes) in chronic kidney disease patients prior to hemodialysis procedures.

1 INTRODUCTION

1.1 The Challenge of Early Detection of

Intradialytic Hypotension

Dialysis and transplantation have allowed many pa-

tients with advanced chronic kidney disease (CKD) to

live longer and maintain a good quality of life. Over

https://orcid.org/0009-0006-2323-192X

https://orcid.org/0000-0001-7723-7012

https://orcid.org/0009-0007-9934-5250

https://orcid.org/0000-0001-5500-0188

https://orcid.org/0000-0002-1365-9280

https://orcid.org/0000-0002-9125-9311

https://orcid.org/0000-0002-6897-8262

the last decade, the prevalence of advanced CKD has

increased by nearly 30% in Spain. As a result, in

2021, 65,740 patients required dialysis to stay alive,

with 26,690 of them undergoing hemodialysis (HD)

techniques (Sociedad Española de Nefrología, 2023).

Intradialytic hypotension (IDH) is one of the most

common adverse events during hemodialysis (HD)

sessions. Despite signiﬁcant advances in HD tech-

niques in recent decades that have improved its man-

agement, IDH remains prevalent, occurring in 5-40%

of sessions (Flythe et al., 2020). This condition is as-

sociated with high morbidity and mortality rates, as

well as multiple complications. Currently, there are

no standardized guidelines or systematic treatments

for managing IDH; various maneuvers are employed

to control it (Furaz Czerpak et al., 2014). IDH is char-

Santamera-Lastras, M., Cisternas Caneo, F., Barrera García, J. C., Crawford, B., Garcéz-Jiménez, A., Rodríguez Puyol, D. and Gómez Pulido, J. M.

Metaheuristics Applied to Optimal Feature Selection for Accurate Predictive Models in Smart Health: A Case Study on Hypotension Prediction in Hemodialysis Patients.

DOI: 10.5220/0013164000003911

Paper published under CC license (CC BY-NC-ND 4.0)

In Proceedings of the 18th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2025) - Volume 1, pages 563-570

ISBN: 978-989-758-731-3; ISSN: 2184-4305

563

acterized by three essential components: a drop of

more than 20 mmHg in systolic blood pressure (SBP)

or more than 10 mmHg in mean arterial pressure

(MAP), the presence of ischemic symptoms in vari-

ous organs, and interventions by dialysis staff (Agar-

wal, 2012).

Identifying the occurrence of intradialytic hy-

potension (IDH) during hemodialysis (HD) treatment

is challenging due to the numerous variables involved,

including the type of dialyzer, dialyzer temperature,

patient characteristics, dialysis modality, and medical

criteria. Therefore, pinpointing the factors that most

signiﬁcantly inﬂuence the occurrence of IDH and pre-

dicting its onset would provide valuable decision sup-

port for medical staff.

1.2 Prediction of IDH Using Machine

Learning Techniques

Determining whether a patient experiences intradia-

lytic hypotension (IDH) during hemodialysis (HD)

treatment is a complex task for traditional statistical

models. However, machine learning (ML) models

can effectively discover patterns in the data, making

them particularly well-suited for this problem (Nay-

yar et al., 2021). This work aims to identify an op-

timal ML-based model to predict the likelihood of

IDH at the start of HD treatment. To achieve this,

metaheuristic techniques will be employed to select

the best combination of variables from the patient’s

clinical and analytical blood records, which contain

numerous characteristics, to reduce noise, improve

model performance, and enhance the accuracy of the

ML model.

Recent research has developed predictive mod-

els that utilize machine learning algorithms and opti-

mization techniques to forecast intradialytic hypoten-

sion (IDH) episodes with high accuracy. These mod-

els empower healthcare professionals to anticipate

and prevent IDH, thereby enhancing patient safety

and well-being during hemodialysis treatments. By

combining machine learning with optimization tech-

niques, signiﬁcant progress has been made in man-

aging and preventing health complications. Various

studies have proposed different methodologies to pre-

dict and manage IDH episodes, aiming to improve pa-

tient outcomes and enhance clinical decision-making.

For instance, Hong et al. (Hong et al., 2023) in-

troduced a software-based artiﬁcial intelligence alert

system to identify patients at high risk of IDH before

hemodialysis sessions. Similarly, Gervasoni et al.

(Gervasoni et al., 2023) developed and validated two

AI-based risk models to predict symptomatic IDH

at different time points. Othman et al. (Othman

et al., 2022) focused on early alert systems, employ-

ing AI and machine learning techniques for IDH pre-

diction prior to the initiation of hemodialysis. In an-

other approach, Yang et al. (Yang et al., 2022) pre-

sented a hybrid model, BSCWJAYA-KELM, which

integrates serum biomarkers with a novel optimiza-

tion algorithm to enhance prediction accuracy. Addi-

tionally, Chen et al. (Chen et al., 2020) investigated

clinical factors associated with IDH using deep learn-

ing, while Lee et al. (Lee et al., 2023) developed

a model based on pre-dialysis features for IDH pre-

diction. Zhang et al. (Zhang et al., 2023) utilized a

machine learning model to predict IDH in in-center

hemodialysis patients 15 to 75 minutes in advance.

Kim et al. (Kim et al., 2022) introduced a deep learn-

ing model that relies solely on monitoring measure-

ments from hemodialysis machines, and Mendoza-

Pittí et al. (Mendoza-Pittí et al., 2022) created a ma-

chine learning model capable of predicting IDH at the

start of hemodialysis sessions. Finally, Gómez-Pulido

et al. (Gómez-Pulido et al., 2021) predicted hypoten-

sion using models trained on a large dialysis database.

These studies collectively enhance the under-

standing of intradialytic hypotension (IDH) predic-

tion and management through the use of machine

learning algorithms and AI techniques. Their aim is

to identify high-risk patients early, facilitating timely

interventions to prevent acute hypotension during

hemodialysis and ultimately improving patient out-

comes.

1.3 Optimal Feature Selection

Technique Based on Metaheuristics

In the healthcare ﬁeld, innovative and efﬁcient so-

lutions to complex problems are crucial (Haupt and

Haupt, 2004). Metaheuristics, which include algo-

rithms such as genetic algorithms, particle swarm op-

timization, and ant colony optimization, have signif-

icant applications in this sector. These algorithms

iteratively search for optimal solutions by initializ-

ing a population of candidate solutions and evaluating

their ﬁtness according to an objective function. Based

on this evaluation, they generate new populations by

modifying the existing ones to enhance overall ﬁtness.

This process, illustrated in Fig. 1, continues until a

stopping criterion is met.

These techniques are employed to optimize medi-

cal treatments, enhance hospital scheduling, and facil-

itate the analysis of large medical datasets. The ability

of metaheuristics to identify near-optimal solutions

within a reasonable time frame makes them ideal for

addressing healthcare challenges, where swift and ac-

curate decisions can signiﬁcantly impact patient out-

BIOINFORMATICS 2025 - 16th International Conference on Bioinformatics Models, Methods and Algorithms

564

comes (Rahul and Banyal, 2022; Aljohani, 2024).

The Feature Selection (FS) problem is a multi-

objective combinatorial optimization challenge that

arises from the need to eliminate irrelevant and redun-

dant information from datasets used in machine learn-

ing training. Such irrelevant information can hinder

the performance of prediction or classiﬁcation algo-

rithms.

In its mathematics deﬁnition, FS assumes a

dataset O such as the original dataset, which con-

tains a o number of f features, such that O =

{ f

, f

, ..., f

}. The objective of the prob-

lem is to select the best subset of features B =

{ f

, f

, ..., f

} with b < o so that the features be-

longing to the selected subset are the most represen-

tative of the set of original data. (Ebiaredoh-Mienye

et al., 2022; Pudjihartono et al., 2022)

Internally, feature selection operates within a bi-

nary domain, where solutions are represented by ones

and zeros. A one indicates that a feature is included

in the data subset, while a zero signiﬁes its exclu-

sion. Fig. 2 illustrates this representation for the fea-

ture selection problem. In this example, the original

dataset comprises six features, resulting in the solu-

tion [1, 1, 1,1, 1, 1]. After applying the metaheuristics,

the optimal subset of features, which includes fea-

tures f

, f

, and f

, is identiﬁed, yielding the solution

[1, 0, 1, 1, 0, 0].

The objective function to optimize within the

feature selection problem is represented in differ-

ent ways, but the most used is the weighted multi-

objective function (Barrera-García et al., 2023). In

this way, the objective function Z is calculated as the

following equation (1):

min Z = α · metric + β ·

|R|

|N|

(1)

where metric corresponds to a performance metric

obtained from the machine learning technique, |R| is

the number of selected features, and |N| the total num-

ber of features, with α ∈ [0, 1] and β = (1−α) param-

eters that regulate the importance of the quality of the

results and the size of the subset, respectively.

Figure 1: Flowchart of the metaheuristic algorithm process.

1 1 1 1 1 1

1 0 1 1 0 0

After Feature

Selection with

metaheuristics

Original Dataset Best Subset

Figure 2: Binary representation of vectors S and D.

1.4 Hypothesis

It is possible to detect an optimal combination of clin-

ical and analytical parameters associated with the de-

velopment of IDH.

By analyzing a patient’s previous analytical val-

ues and relevant data, along with measurements of

speciﬁc clinical parameters at the beginning of a

hemodialysis (HD) session, it may be possible to pre-

dict the occurrence of an intradialytic hypotension

(IDH) episode. This could help reduce its incidence

and enable early intervention by medical staff. The re-

search aims to extract and construct a subset of signif-

icantly relevant variables from a large database of HD

sessions and blood tests. This subset will be suitable

for processing with mass data tools and will facilitate

the prediction of IDH based on patterns derived from

an optimal and reduced set of clinical and analytical

parameters.

2 MATERIALS AND METHODS

2.1 Digitized Clinical Database of

Hemodialysis Patients

Our research is based on two extensive databases con-

taining medical information from 758 patients under-

going hemodialysis sessions at Hospital Príncipe de

Asturias in Madrid, Spain, over nearly four years,

from January 1, 2016, to October 30, 2019. The ﬁrst

database includes clinical data for 98,015 hemodialy-

sis sessions, each lasting about ﬁve hours, and com-

prises patient and session identiﬁcation data along

with 180 clinical parameters automatically recorded

by the dialyzer during the sessions. The second

database contains blood analysis data for the same pa-

tients and period, featuring 141 variables arising from

routine blood tests performed at varying intervals.

To create the dataset for this study, the two

databases were merged such that each hemodialysis

session was linked to both the session’s clinical data

and the results of the blood test closest in time—either

on the same date or immediately prior. In total, this

Metaheuristics Applied to Optimal Feature Selection for Accurate Predictive Models in Smart Health: A Case Study on Hypotension

Prediction in Hemodialysis Patients

565

merging process resulted in a new dataset contain-

ing 221 variables related to patient information, dial-

ysis sessions, and blood tests. The collection of these

records was approved by the hospital’s ethics commit-

tee, with all data fully anonymized.

2.2 Database of HD Sessions Indicating

IDH Events

In the integrated database, HD sessions in which

the IDH event is determined by medical criteria are

marked. For this purpose, it is determined from the

systolic arterial pressures measured in each of the 5

hours of the session whether the difference between

any of these pressures measured during the session

is at least 20 mmHg lower than the pressures mea-

sured at the beginning of the session and at the ﬁrst

hour. Each HD session is assigned a new binary vari-

able, IDH, that takes the value of "1" if there is a hy-

potension event and "0" if there is not. Finally, a new

database, shown in Fig. 3, is formed containing the

analytical and clinical variables associated with each

session, keeping only the variables associated with the

start of the HD session, the so-called "Hour 0".

This new database supports the goal of developing

a predictive system that can accurately anticipate po-

tential hypotensive episodes during dialysis sessions.

2.3 Data Processing and Variables

Considered in the Clinical Study

After obtaining the integrated database, the medical

staff, using their experience and clinical knowledge,

determined a set of 67 variables to be considered for

the predictive model to be developed. Subsequently,

and as is usual in data science work, it is necessary

to pre-process the data before starting the modeling

phases. In the present case, after the tasks of integrat-

ing and combining data from different sources, clean-

ing tasks have been carried out, identifying and cor-

recting possible errors in the data. For this purpose,

an analysis of the variables to be analyzed was carried

out, detecting possible quality problems in the data,

and cleaning tasks were carried out in order to have

a set of consistent data, correcting incorrect, atypical,

or missing data. This data analysis process detected

that many dialysis sessions or analytical tests lacked

quality data. Therefore, the medical team opted to

keep the largest number of variables to be considered

for the models and to eliminate patients and dialy-

sis sessions that lacked quality data so that ﬁnally,

the data set used to run the predictive model consid-

ered 328 patients and a set of 68,574 sessions, which

constitutes 69.96% and of which 19,810 sessions had

IDH, representing 28.89% of the total number of HD

sessions considered after the data cleaning process

(Fig. 4). Both the patient cohort size and the number

of HD sessions are representative of the population

under study and allow for adequate modeling of the

predictor system to be developed.

2.4 Enhanced Prediction of IDH

Through Machine Learning and

Biomarker Analysis

2.4.1 Machine Learning Model

To predict IDH, an XGBoost model was employed,

selected for its superior performance in preliminary

experiments compared to various methods, including

KNN, Random Forest, and LGBM, as well as serving

as a representative example of decision tree-based ap-

proaches. Initial hyper-parameter tuning was tailored

to our dataset, which includes 67 features.

A 5-fold cross-validation technique was imple-

mented to validate the model, ensuring robust perfor-

mance evaluation.

Given the dataset’s class imbalance, with only

28.89% of sessions representing the minority class

(hypotensive patients), a data balancing technique

called RandomUnderSampler was employed. This

method was chosen based on previous experiments

that demonstrated its superiority over RandomOver-

Sampler and NearMiss techniques. The imbalance ra-

tio was adjusted to 1:1, equalizing the number of sam-

ples in both the majority and minority classes, thereby

improving the model’s predictive accuracy.

2.4.2 Feature Selection with Metaheuristics

For feature selection, two representative metaheuris-

tic algorithms were employed: Particle Swarm Op-

timization (PSO) and Grey Wolf Optimizer (GWO).

PSO was selected as a classic metaheuristic model

known for its effectiveness in solving optimization

problems by simulating social behavior patterns. In

contrast, GWO, inspired by the social hierarchy and

hunting behavior of grey wolves, is recognized for its

ability to balance exploration and exploitation during

the optimization process. Both algorithms were cho-

sen for their proven capability to efﬁciently navigate

large search spaces and identify relevant features, thus

enhancing the performance of the predictive model.

The binarization of solutions obtained from the

metaheuristics was necessary to convert continuous

variables into binary variables due to the discrete na-

ture of the feature selection optimization problem.

The S4 and STD transfer functions were employed

BIOINFORMATICS 2025 - 16th International Conference on Bioinformatics Models, Methods and Algorithms

566

Figure 3: Predictive model at the start of the hemodialysis session.

Figure 4: Number of HD sessions used to run the model.

for this process (Cisternas-Caneo et al., 2024).

The metaheuristic approach utilized two objective

functions (Eq. 2), both incorporating the number of

features in the model and various metrics to evaluate

the classiﬁer’s performance, minimized over 100 iter-

ations. The ﬁrst version aimed to maximize the recall

of the minority class (hypotensive patients) (Eq. 3),

while the second focused on maximizing the F1-score

of the minority class (Eq. 4). Additionally, the ob-

jective functions were weighted to favor performance

metrics over feature reduction, allowing ﬂexibility in

adapting to the problem’s nature.

minz = 0.99 · Metric+ 0.01 ·

Selected features

Total features

(2)

Metric 1: 1 − Minority Class Recall

(3)

Metric 2: 1 − Minority Class F1-score

(4)

where F1-score and recall are deﬁned as:

F1-Score =

T P

T P + 0.5(FP + FN)

Recall =

T P

T P + FN

and T P is true positives, FN is false negatives and FP

is false positives.

By iterating through these objectives, the meta-

heuristic algorithm effectively selected features that

enhanced the model’s predictive performance while

addressing the dataset’s class imbalance, as shown in

Figure 5. Feedback from the classiﬁer’s performance

metrics guided the metaheuristic process, represent-

ing a novel approach where the selection module in-

tegrates performance testing with various sets of vari-

ables chosen by the metaheuristic.

3 EXPERIMENTAL RESULTS

3.1 Initial Model Performance

Before applying metaheuristics algorithms for feature

selection, the XGBoost model was trained using the

complete set of 67 features. The initial performance

metrics were as follows: an F1-score of 0.7172, pre-

cision of 0.7520, and recall of 0.7244. Given the na-

ture of the problem, accurately predicting the minor-

ity class, i.e., hypotensive episodes, is critical. There-

fore, the study primarily focused on metrics for the

minority class. Speciﬁcally, the recall for the mi-

nority class was emphasized to minimize false neg-

atives, which correspond to cases where hypotensive

episodes are not predicted when they should be. Thus,

the recall and F1-score for the minority class were

the key metrics of interest, with values of 0.7541 and

0.6373, respectively.

3.2 Feature Reduction

Table 1 compares the performance of the XGBoost

model using the original 67 features against the per-

Metaheuristics Applied to Optimal Feature Selection for Accurate Predictive Models in Smart Health: A Case Study on Hypotension

Prediction in Hemodialysis Patients

567

…

Yes

Feature Selection Module

Full Dataset

Best subset

Metaheuristic

Module

PSO

GWO

WOA

PSA

Machine Learning

Module

KNN

Subset with

some features

enabled

Performance

Metrics

finish?

Figure 5: Feature selection process.

formance after feature reduction using PSO and GWO

with both objective function metrics (Eq. 3, Eq. 4).

The best results for the minority class F1-score and

recall were achieved with the PSO algorithm using

the ﬁrst objective function (Eq. 3). Speciﬁcally, the

minority class F1-score and recall values were 0.6419

and 0.7640, respectively. The feature selection pro-

cess reduced the number of features to 36. Also us-

ing PSO algorithm for feature reduction achieved a

46.27% decrease in the initially selected features by

the experts. Additionally, this process resulted in a

slight improvement in the XGBoost model’s perfor-

mance for predicting hypotensive episodes in dialysis

patients. Speciﬁcally, the F1-score for the minority

class increased from 0.6373 to 0.6419, and the re-

call for the minority class improved from 0.7541 to

0.7640.

Fig. 6 presents the Receiver Operating Character-

istic (ROC) curve for the IDH prediction model using

the 36 features selected by PSO algorithm. The Area

Under the Curve (AUC) value is 0.76.

Figure 6: Receiver Operating Characteristic (ROC) curve

and the area under the curve (AUC) for the XGBoost model

trained with the 36 features chosen by the PSO model.

Fig. 7 shows the ﬁtness evolution of the best so-

lution over iterations for the PSO algorithm using the

objective function that included the minority class re-

call as a metric (Eq. 3). The plot illustrates a gradual

decrease in ﬁtness, corresponding to an increase in the

minority class recall. This trend indicates a reduction

in false negatives and, consequently, an improvement

in the predictive performance of the model.

4 CONCLUSION

Predicting hypotensive episodes in chronic kidney

disease patients prior to dialysis is crucial for prevent-

ing complications and ensuring effective treatment.

This study employed the XGBoost machine learn-

ing model to predict such episodes, initially using a

dataset of 67 features selected by medical experts.

Subsequently, metaheuristic algorithms, speciﬁcally

PSO and GWO, were used to reduce the feature set

and enhance the model’s performance.

A key challenge in developing predictive mod-

els for medical applications is the large number of

Figure 7: Fitness evolution.

BIOINFORMATICS 2025 - 16th International Conference on Bioinformatics Models, Methods and Algorithms

568

Table 1: Classiﬁcation results of models using Stratiﬁed k-fold cross-validation method.

Model FO IDH Patient Non-IDH Patient Accuracy Total

Metric Recall F1-score Precision Recall F1-score Precision Features

NO NO 0.7541 (0.007) 0.6373 (0.0060) 0.5518 (0.006) 0.7512 (0.004) 0.8116 (0.003) 0.8826 (0.003) 0.7520 (0.004) 67

PSO 1 0.7640 (0.015) 0.6419 (0.006) 0.5536 (0.005) 0.7497 (0.007) 0.8124 (0.003) 0.8867 (0.006) 0.7538 (0.004) 36

GWO 1 0.7617 (0.012) 0.6352 (0.006) 0.5448 (0.005) 0.7414 (0.006) 0.8067 (0.004) 0.8845 (0.005) 0.7473 (0.004) 38

PSO 2 0.7603 (0.008) 0.6422 (0.007) 0.5559 (0.007) 0.7532 (0.005) 0.8140 (0.004) 0.8855 (0.003) 0.7568 (0.006) 41

GWO 2 0.7560 (0.008) 0.6396 (0.009) 0.5527 (0.010) 0.7504 (0.009) 0.8119 (0.007) 0.8846 (0.004) 0.7547 (0.008) 38

NO, Without Metaheuristics; PSO, Particle Swarm Optimization; GWO, Grey Wolf Optimizer.

features, which can lead to increased computational

complexity and potential overﬁtting. Effective feature

reduction is essential to streamline the model, reduce

computational load, and eliminate irrelevant or redun-

dant information, thereby improving the model’s per-

formance and generalizability.

The study successfully developed a model capa-

ble of identifying IDH events in chronic kidney dis-

ease patients. By applying metaheuristic techniques,

the number of features was signiﬁcantly reduced, and

the predictive accuracy of the model was improved.

The PSO algorithm, in particular, demonstrated a ro-

bust reduction in the number of features, decreasing

the feature set by 46.27% from the original 67 fea-

tures to 36. This substantial reduction in features not

only simpliﬁes the model but also enhances its in-

terpretability and efﬁciency. Reducing the number

of features decreases the computational load, mak-

ing the model faster and more efﬁcient in processing

data. Additionally, eliminating irrelevant and redun-

dant information helps to avoid overﬁtting, thereby

improving the model’s generalizability. This stream-

lined approach also facilitates easier interpretation of

the model’s results, allowing for more straightforward

insights into the key factors inﬂuencing IDH events.

This reduction in characteristics does not compro-

mise the model’s effectiveness. On the contrary, the

optimized model achieved an AUC of 0.76, demon-

strating its capability to distinguish between hypoten-

sive and non-hypotensive episodes. Additionally, it

achieved a recall of 0.764 for the minority class (hy-

potensive episodes) in chronic renal patients prior to

dialysis procedures. This high recall value signiﬁes

a low number of false negatives, meaning that the

model effectively identiﬁes patients with IDH who

might otherwise not be warned of such episodes. The

improvement of recall and F1-score for the minority

class, critical metrics for this medical prediction task,

underlines the effectiveness of using metaheuristics

and, in this case, PSO selects fewer signiﬁcant fea-

tures. Consequently, the optimized model not only

provides reliable predictions of hypotensive episodes

but also operates with greater efﬁciency, facilitating

its potential integration into clinical practice.

In conclusion, this study highlights the importance

of feature reduction in predictive modeling for medi-

cal applications. The use of metaheuristic algorithms,

particularly PSO, has proven to be an effective strat-

egy for enhancing model performance while signiﬁ-

cantly reducing the number of required features. This

approach offers a promising pathway for developing

efﬁcient, accurate, and clinically applicable models

for predicting hypotensive episodes in chronic kidney

disease patients at the beginning of the HD session.

Looking ahead, there are several avenues for fu-

ture research. Our model could be further reﬁned by

exploring additional machine learning techniques or

ensemble methods combining different approaches.

Extending the study to include predictions for other

pathological events during dialysis, such as thrombo-

sis and arrhythmias, could enhance its clinical utility.

Additionally, incorporating the full initial dataset of

200 variables—beyond those selected by medical ex-

perts—and addressing missing data through imputa-

tion techniques would provide a more comprehensive

analysis and potentially improve the model’s predic-

tive capabilities.

ACKNOWLEDGEMENTS

This work is part of the project "Prevention of seri-

ous pathological events in hemodialysis patients by

non-invasive continuous monitoring of vital signs and

analysis of circular biomarkers (ALLPREVENT)",

PMPTA23/00033, which has been funded by the In-

stituto de Salud Carlos III (ISCIII) within the pro-

gramme of R&D projects linked to personalised

medicine and advanced therapies. The authors would

like to thank Dr. Pablo Herrera, MSc. Melisa Granda

and MSc. Daniella Peña for their collaboration in the

study and the company Intelligent Data for their sup-

port and collaboration in research aimed at the design

and joint development of technological products in

the ﬁeld of health. Felipe Cisternas-Caneo and Jose

Barrera-García are both supported by the National

Agency for Research and Development (ANID) un-

der the Doctorado Nacional Scholarship Program.

Metaheuristics Applied to Optimal Feature Selection for Accurate Predictive Models in Smart Health: A Case Study on Hypotension

Prediction in Hemodialysis Patients

569

REFERENCES

Agarwal, R. (2012). How can we prevent intradialytic hy-

potension? Current opinion in nephrology and hyper-

tension, 21(6):593–599.

Aljohani, A. (2024). Optimizing Patient Stratiﬁcation in

Healthcare: A Comparative Analysis of Clustering

Algorithms for EHR Data. International Journal of

Computational Intelligence Systems, 17(1):173.

Barrera-García, J., Cisternas-Caneo, F., Crawford, B.,

Gómez Sánchez, M., and Soto, R. (2023). Feature se-

lection problem and metaheuristics: a systematic lit-

erature review about its formulation, evaluation and

applications. Biomimetics, 9(1):9.

Chen, J. B., Wu, K. C., Moi, S. H., Chuang, L. Y., and Yang,

C. H. (2020). Deep learning for intradialytic hypoten-

sion prediction in hemodialysis patients. IEEE Access,

8:82382–82390.

Cisternas-Caneo, F., Crawford, B., Soto, R., Giachetti, G.,

Paz, Á., and Peña Fritz, Á. (2024). Chaotic binariza-

tion schemes for solving combinatorial optimization

problems using continuous metaheuristics. Mathe-

matics, 12(2).

Ebiaredoh-Mienye, S. A., Swart, T. G., Esenogho, E., and

Mienye, I. D. (2022). A machine learning method

with ﬁlter-based feature selection for improved pre-

diction of chronic kidney disease. Bioengineering,

9(8):350.

Flythe, J. E., Chang, T. I., Gallagher, M. P., Lindley, E.,

Madero, M., Saraﬁdis, P. A., Unruh, M. L., Wang,

A. Y.-M., Weiner, D. E., Cheung, M., et al. (2020).

Blood pressure and volume management in dialysis:

conclusions from a kidney disease: Improving global

outcomes (kdigo) controversies conference. Kidney

international, 97(5):861–876.

Furaz Czerpak, K. R., Puente García, A., Corchete Prats,

E., Moreno de la Higuera, M., Gruss Vergara, E., and

Martín-Hernández, R. (2014). Estrategias para el con-

trol de la hipotensión en hemodiálisis. Nefrología,

6(1):1–14.

Gervasoni, F., Bellocchio, F., Rosenberger, J., Arkossy, O.,

Ion Titapiccolo, J., Kovarova, V., Larkin, J., Nikam,

M., Stuard, S., Tripepi, G. L., Usvyat, L. A., Winter,

A., Neri, L., and Zoccali, C. (2023). Development

and validation of ai-based triage support algorithms

for prevention of intradialytic hypotension. Journal of

Nephrology, 36(7):2001 – 2011. Cited by: 0.

Gómez-Pulido, J. A., Gómez-Pulido, J. M., Rodríguez-

Puyol, D., Polo-Luque, M. L., and Vargas-Lombardo,

M. (2021). Predicting the appearance of hypotension

during hemodialysis sessions using machine learning

classiﬁers. International Journal of Environmental

Research and Public Health, 18(5):2364.

Haupt, R. L. and Haupt, S. E. (2004). Practical Genetic

Algorithms. John Wiley & Sons.

Hong, D., Chang, H., He, X., Zhan, Y., Tong, R., Wu,

X., and Li, G. (2023). Construction of an early alert

system for intradialytic hypotension before initiating

hemodialysis based on machine learning. Kidney Dis-

eases, 9(5):433 – 442. Cited by: 2; All Open Access,

Gold Open Access.

Kim, H. W., Heo, S. J., Kim, M., Lee, J., Park, K. H., Lee,

G., and Kim, B. S. (2022). Deep learning model for

predicting intradialytic hypotension without privacy

infringement: a retrospective two-center study. Fron-

tiers in Medicine, 9:878858.

Lee, H., Moon, S. J., Kim, S. W., Min, J. W., Park,

H. S., Yoon, H. E., and Chung, B. H. (2023).

Prediction of intradialytic hypotension using pre-

dialysis features—a deep learning–based artiﬁcial in-

telligence model. Nephrology Dialysis Transplanta-

tion, 38(10):2310–2320.

Mendoza-Pittí, L., Gómez-Pulido, J. M., Vargas-Lombardo,

M., Gómez-Pulido, J. A., Polo-Luque, M.-L., and

Rodréguez-Puyol, D. (2022). Machine-learning

model to predict the intradialytic hypotension based

on clinical-analytical data. IEEE Access, 10:72065–

72079.

Nayyar, A., Gadhavi, L., and Zaman, N. (2021). Machine

learning in healthcare: review, opportunities and chal-

lenges. Machine Learning and the Internet of Medical

Things in Healthcare, pages 23–45.

Othman, M., Elbasha, A. M., Naga, Y. S., and Moussa,

N. D. (2022). Early prediction of hemodialysis com-

plications employing ensemble techniques. BioMedi-

cal Engineering Online, 21(1). Cited by: 1; All Open

Access, Gold Open Access, Green Open Access.

Pudjihartono, N., Fadason, T., Kempa-Liehr, A. W., and

O’Sullivan, J. M. (2022). A review of feature selec-

tion methods for machine learning-based disease risk

prediction. Frontiers in Bioinformatics, 2:927312.

Rahul, K. and Banyal, R. K. (2022). Metaheuristics ap-

proach to improve data analysis process for the health-

care sector. Procedia Computer Science, 215:98–

103. 4th International Conference on Innovative Data

Communication Technology and Application.

Sociedad Española de Nefrología (2023). La enfermedad

renal crónica en españa 2023. Accessed: 2024-07-12.

Yang, X., Zhao, D., Yu, F., Heidari, A. A., Bano, Y., Ibro-

himov, A., and Chen, X. (2022). Boosted machine

learning model for predicting intradialytic hypoten-

sion using serum biomarkers of nutrition. Computers

in Biology and Medicine, 147:105752.

Zhang, H., Wang, L. C., Chaudhuri, S., Pickering, A.,

Usvyat, L., Larkin, J., and Kotanko, P. (2023). Real-

time prediction of intradialytic hypotension using ma-

chine learning and cloud computing infrastructure.

Nephrology Dialysis Transplantation, 38(7):1761–

1769.

BIOINFORMATICS 2025 - 16th International Conference on Bioinformatics Models, Methods and Algorithms

570