Multimorbidity in Heart Failure Patients: Application of Machine

Learning Algorithms to Predict Imminent Health Outcomes

Jorge Cerejo (1,2; a), Rui Lopes Baeta (2; b), Simão Gonçalves (1; c), Bernardo Neves (1,3; d), Pedro Morais Sarmento (3; e), José Maria Moreira (1; f), Nuno André da Silva (1; g), Francisca Leite (1; h), Bruno Martins (2; i) and Mário J. Silva (2; j)

1 Hospital da Luz Learning Health, Luz Saúde, Lisboa, Portugal

2 INESC-ID, Instituto Superior Técnico, Universidade de Lisboa, Lisboa, Portugal

3 Hospital da Luz Lisboa, Luz Saúde, Lisboa, Portugal

{bernardo.neves, psarmento}@hospitaldaluz.pt, {jose.maria.moreira, nuno.asilva, francisca.leite}@luzsaude.pt,

Keywords:

Multimorbidity, Heart Failure, Laboratory Tests, Health Outcomes Prediction.

Abstract:

As populations age and life expectancy increases, multimorbidity, which is the simultaneous presence of two

or more chronic conditions, has become increasingly common, especially among older adults. Heart failure,

a widespread and heterogeneous syndrome, has sparked research into multimorbidity to deepen our under-

standing of its pathophysiology and improve clinical management approaches. This paper offers a detailed

characterization of a heart failure patient cohort, utilizing clinical data from a Portuguese tertiary hospital.

Based on this characterization, we developed a clinical tool for identiﬁcation of high-risk patients and pre-

diction of imminent hospital admissions based on laboratory tests. Our models for predicting imminent hos-

pitalization showed reasonable effectiveness (AUROC of 0.79 with lab test prescriptions and 0.72 with lab

test results). These ﬁndings emphasize the signiﬁcant predictive value of laboratory tests in the context of

HF. Additionally, we investigated the explainability of our models using SHAP values, in collaboration with

clinical experts, providing insights into factors inﬂuencing the models’ predictions. These results highlight the

importance of secondary clinical data analysis assisting healthcare professionals in identifying patients at high

risk of adverse events, and improving patient care and outcomes.

1 INTRODUCTION

As life expectancy increases and populations age, multimorbidity, defined as the presence of two or more chronic conditions, has become increasingly prevalent in healthcare systems worldwide (WHO, 2016). Multimorbidity poses significant challenges for patients, clinicians, and healthcare systems, underscoring the urgent need for innovative tools to better understand and improve clinical outcomes (Majnarić et al., 2021). In Portugal, the prevalence of multimorbidity is especially high, affecting 78.3% of the elderly population (aged 65 and older) and 38.3% of those between the ages of 24 and 75 (Rodrigues et al., 2018; Romana et al., 2019).

a https://orcid.org/0009-0004-5221-4964

b https://orcid.org/0009-0006-8350-8302

c https://orcid.org/0009-0004-6565-8599

d https://orcid.org/0000-0002-1559-7482

e https://orcid.org/0000-0002-5970-2707

f https://orcid.org/0000-0003-2420-7930

g https://orcid.org/0000-0003-4216-2107

h https://orcid.org/0000-0003-2550-2616

i https://orcid.org/0000-0002-3856-2936

j https://orcid.org/0000-0002-5452-6185

Heart Failure (HF), a clinical syndrome charac-

terized by structural and/or functional cardiac abnor-

malities, is particularly associated with multimorbid-

ity (Bozkurt et al., 2021). Among HF patients, 86%

suffer from at least two chronic conditions, and 42%

live with ﬁve or more (Chamberlain et al., 2015). The

coexistence of conditions complicates treatment de-

cisions and increases the risk of adverse outcomes,

emphasizing the need for improved risk stratiﬁcation

methods (Navickas et al., 2016).

Electronic Health Records (EHRs) offer a valuable opportunity for advanced multimorbidity characterization and enhanced patient outcome predictions (Williams et al., 2022). Prior studies suggest that the secondary use of EHR data can improve care quality, reduce medical errors, and generate cost savings (CMS, 2021). However, translating this data into actionable insights remains a critical challenge.

Cerejo, J., Baeta, R. L., Gonçalves, S., Neves, B., Sarmento, P. M., Moreira, J. M., da Silva, N. A., Leite, F., Martins, B. and Silva, M. J.

Multimorbidity in Heart Failure Patients: Application of Machine Learning Algorithms to Predict Imminent Health Outcomes.

DOI: 10.5220/0013381800003911

Paper published under CC license (CC BY-NC-ND 4.0)

In Proceedings of the 18th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2025) - Volume 2: HEALTHINF, pages 330-339

ISBN: 978-989-758-731-3; ISSN: 2184-4305

In this work, we focus on the development of a

predictive tool designed to forecast imminent hospi-

talizations among HF patients with multimorbidity

using pseudonymized EHR data from a Portuguese

tertiary hospital. As part of the broader Intelligent-

Care project, this study aims to identify high-risk pa-

tients and support clinicians in making timely and in-

formed decisions in the emergency department (ED).

We introduce the ICIHO (IntelligentCare Imminent

Health Outcomes) predictive tool, which leverages

classiﬁcation algorithms to predict imminent health

outcomes (IHO) using routine laboratory test results

collected during ED visits. By utilizing observational

data, the ICIHO tool enhances the early identiﬁcation

of HF patients at heightened risk for hospital admis-

sion, thereby facilitating personalized treatment plan-

ning and potentially improving patient outcomes.

The paper is organized as follows: Section 2 pro-

vides an overview of key concepts and related work

relevant to the study. Section 3 describes the method-

ology used for data processing and prediction model-

ing. The results are presented in Section 4, followed

by a discussion in Section 5. Finally, Section 6 offers

concluding remarks and highlights the clinical impli-

cations of the study.

2 RELATED WORK

EHRs serve as a rich source of patient data, enabling applications in learning healthcare systems and precision medicine (Aronson and Rehm, 2015). However, analyzing EHR data presents significant challenges, including missing data, inaccuracies, and data heterogeneity (Hripcsak et al., 2011). To address these issues, data standardization initiatives like the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) have emerged, providing a framework for uniform data extraction and analysis (Observational Health Data Sciences and Informatics (OHDSI), 2023). The OMOP CDM facilitates global data sharing for comparative longitudinal studies, making it an essential tool for analyzing complex patient data (Dixon et al., 2020; Liyanage et al., 2018).

Laboratory tests are essential in healthcare, serving roles in diagnosis, monitoring, screening, and research. They play a crucial role in reducing diagnostic errors and facilitating informed clinical decisions (Wians, 2009; Plebani and Lippi, 2016). Abnormal test values are often early indicators of adverse events, such as increased morbidity and mortality (Asadollahi et al., 2007). Despite their critical importance, analyzing laboratory data is challenging due to its inherent heterogeneity. To ensure consistency and improve interoperability, the healthcare industry employs the LOINC (Logical Observation Identifiers, Names, and Codes) system, which standardizes the identification and representation of laboratory measurements (LOINC®, Regenstrief Institute, Inc., Indianapolis, IN).

In recent years, the use of supervised machine

learning (ML) models to predict adverse clinical out-

comes from EHR data has grown substantially (Lee

et al., 2020; Nwanosike et al., 2022). Logistic Re-

gression (LR) models remain widely used due to their

simplicity, interpretability, and effectiveness in pre-

dicting key outcomes such as mortality and hospi-

tal admissions, thereby inﬂuencing clinical decisions

and improving healthcare delivery (Alanazi, 2022).

Meanwhile, more advanced techniques, such as deep

learning models, have gained prominence for their

ability to handle large datasets and extract complex

patterns (Shamout et al., 2020).

An important application of ML in healthcare is

the use of laboratory data to predict IHO. For ex-

ample, Loekito et al. developed a multivariate LR

model utilizing 30 laboratory variables to predict

IHOs (Loekito et al., 2013). Their model demon-

strated its effectiveness by accurately identifying key

outcomes such as Medical Emergency Team (MET)

calls (AUROC = 0.69), ICU admissions (AUROC

= 0.82), and in-hospital mortality (AUROC = 0.90).

Similarly, Mueller et al. employed a comparable ap-

proach to predict in-hospital mortality, achieving high

performance (AUROC = 0.88). Another model in-

tegrating both demographic and laboratory data also

demonstrated strong predictive ability for hospitaliza-

tions (AUROC = 0.80) (Mueller et al., 2021).

3 MATERIAL AND METHODS

Using a combination of data mining and ML tech-

niques, we developed a pipeline for characteriza-

tion of HF patterns with multimorbidity from EHRs.

The pipeline enables the stratiﬁcation of HF patients

based on their risk of adverse events, such as immi-

nent (≤ 24h) hospitalizations. With the developed

pipeline, we processed the clinical records of Hos-

pital da Luz Lisboa (HLL), an institution that pro-

vides comprehensive medical services across all med-

ical specialties.


Figure 1 provides a visual overview of the devel-

oped workﬂow, which processes the clinical data in

four main stages:

1. Observational Data Extraction

2. HF Cohort Selection

3. Patient Data Loading

4. Imminent Health Outcomes Prediction

Figure 1: Overview of the developed workﬂow.

3.1 Observational Data Extraction

We utilized the IntelligentCare (ICare) dataset

from Hospital da Luz Lisboa (HLL), containing

anonymized medical histories of 834,529 patients,

spanning January 2007 to August 2021. This dataset

was approved by the hospital’s Institutional Review

Board (IRB) for research on multimorbidity. Data

included patient visits, diagnoses, and laboratory re-

sults, essential for predicting imminent health out-

comes (IHO).

3.2 HF and Comorbidities Phenotyping

We utilized a locally validated phenotyping algorithm

that combined ICD-9 codes and HF-related keywords

in clinical text, as detailed in a prior publication by

our research group (Martins et al., 2024). Comorbidi-

ties were selected based on their prevalence and clin-

ical signiﬁcance, guided by clinical expertise. The

phenotyping rules applied included checks for both

ICD-9 codes and relevant clinical text to ensure accu-

rate cohort identiﬁcation.

3.3 Patient Data Loading

To ensure interoperability, we harmonized the ICare

dataset and uploaded it into an OMOP CDM (version

5.3) data warehouse using tools from the Observa-

tional Health Data Sciences and Informatics (OHDSI)

community. The data warehouse serves as the back-

bone of our analytical processes, enabling standard-

ized analysis through tools like White Rabbit for

ETL design (OHDSI, 2021), Athena for concept map-

ping (Athena, 2022), and Achilles for generating stan-

dard metrics (OHDSI, 2014).

We populated six OMOP CDM tables: Per-

son (demographics), Visit Occurrence (patient vis-

its), Condition Occurrence (diagnoses and comorbidi-

ties), Measurement (laboratory values), Death (mor-

tality), and Observation Period (timeline of observa-

tions). We extracted the medical histories of HF co-

hort patients, preserving the chronological order of di-

agnoses, lab tests, and hospital interactions.

Conditions coded using ICD-9 were translated to standardized concept IDs via the Concept Relationship table to ensure accurate mapping within the Condition Occurrence table. Laboratory data required extensive preprocessing due to non-uniform coding. For unmatched lab codes, we applied heuristics such as removing the last digit before mapping to LOINC standards. Of the nearly 40 million lab records, 11.4% could not be mapped and were excluded.
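The fallback heuristic for unmatched lab codes can be pictured with a minimal sketch. The lookup table, the local codes, and the function name `map_to_loinc` are hypothetical; the source only states that the last digit was removed for unmatched codes, so the exact matching rules used in the study may differ.

```python
def map_to_loinc(local_code, loinc_lookup):
    """Return the LOINC code for a local lab code, or None if unmatched.

    First try an exact match; if that fails, retry with the last
    character removed (the heuristic mentioned in the text).
    """
    if local_code in loinc_lookup:
        return loinc_lookup[local_code]
    trimmed = local_code[:-1]
    return loinc_lookup.get(trimmed)  # None when still unmatched


# Hypothetical local codes mapped to LOINC codes that appear in Table 3.
lookup = {"4021": "1988-5", "5530": "33762-6"}

mapped = [map_to_loinc(code, lookup) for code in ["4021", "55307", "9999"]]
unmapped_share = mapped.count(None) / len(mapped)
```

Records whose code is still unmatched after the fallback would be the ones excluded from the warehouse.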

HEALTHINF 2025 - 18th International Conference on Health Informatics

Lab results, which varied from quantitative to nominal or free-text entries, were harmonized using regular expressions and keywords. Numerical data were standardized, and categorical results (e.g., "Positive", "POS") were unified under consistent labels. Finally, we mapped the processed data to OMOP CDM fields and loaded them into the data warehouse using Pentaho Data Integration (Hitachi Vantara), completing the ETL process and enabling robust, standardized analysis for subsequent predictive modeling.
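The harmonization of lab results described above can be sketched with a few regular expressions. This is an illustration only: the patterns, the output labels, and the treatment of decimal commas are assumptions, not the rules used in the study.

```python
import re

# Hypothetical keyword patterns; the study's actual keyword lists are not given.
POSITIVE = re.compile(r"^\s*(pos|positive|\+)\s*$", re.IGNORECASE)
NEGATIVE = re.compile(r"^\s*(neg|negative|-)\s*$", re.IGNORECASE)
# Optional comparator (<, >, <=, >=) followed by a number with "." or "," decimals.
NUMBER = re.compile(r"^\s*([<>]=?)?\s*(-?\d+(?:[.,]\d+)?)\s*$")


def harmonize(raw):
    """Map a raw result string to a (kind, value) pair."""
    if POSITIVE.match(raw):
        return ("categorical", "Positive")
    if NEGATIVE.match(raw):
        return ("categorical", "Negative")
    m = NUMBER.match(raw)
    if m:
        # The comparator, if any, is dropped in this sketch.
        return ("numeric", float(m.group(2).replace(",", ".")))
    return ("unparsed", raw.strip())


results = [harmonize(s) for s in ["POS", "Positive", " 8,65 ", "<0.5", "see note"]]
```

Entries that end up as `"unparsed"` would need manual review or additional keywords before loading.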

3.4 Imminent Health Outcome Prediction

We developed a methodology to predict imminent

hospitalizations of HF patients admitted to the ED us-

ing laboratory data from the ICare dataset. The pri-

mary aim was to evaluate the predictive power of lab

tests in forecasting IHO for HF patients. Only pa-

tients with at least one recorded laboratory test were

included in the analysis.

To predict imminent hospitalizations, we linked

laboratory measurements to subsequent clinical

episodes in each patient’s history. Clinical episodes,

deﬁned as healthcare visits (e.g., hospitalizations, ED

visits, consultations), were recorded in the Visit Oc-

currence table. The algorithm identiﬁed the next clin-

ical episode following each lab measurement and cal-

culated the time difference. Episodes followed by

hospitalization within 24 hours were labeled as posi-

tive (1), while those leading to discharge were labeled

as negative (0).
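The labeling step can be illustrated with a short sketch. It assumes each patient's visits are available as (start time, visit type) pairs; the visit-type strings and the function name are hypothetical, and the real pipeline works on OMOP Visit Occurrence records rather than Python tuples.

```python
from datetime import datetime, timedelta

# Hypothetical visit-type label; the actual Visit Occurrence concepts may differ.
HOSPITALIZATION = "inpatient"


def label_measurement(measured_at, visits, window=timedelta(hours=24)):
    """Return 1 if the next visit after `measured_at` is a hospitalization
    that starts within `window`, otherwise 0.

    `visits` is a list of (start_datetime, visit_type) tuples for one patient.
    """
    later = sorted(v for v in visits if v[0] > measured_at)
    if not later:
        return 0  # no subsequent episode recorded
    next_start, next_type = later[0]
    within_window = next_start - measured_at <= window
    return int(next_type == HOSPITALIZATION and within_window)


visits = [
    (datetime(2020, 1, 1, 8, 0), "emergency"),
    (datetime(2020, 1, 1, 20, 0), "inpatient"),
    (datetime(2020, 3, 1, 9, 0), "outpatient"),
]
label_ed = label_measurement(datetime(2020, 1, 1, 9, 30), visits)
label_later = label_measurement(datetime(2020, 2, 28, 10, 0), visits)
```

In this toy history the lab test taken during the ED visit is followed by an inpatient stay about ten hours later, so it is labeled positive; the later measurement is followed by an outpatient visit and is labeled negative.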

We ﬁrst trained a multivariate LR model using

binary indicators of lab test prescriptions (1 if pre-

scribed, 0 otherwise). This approach eliminated the

need for actual lab results, mitigating issues with

missing data. By applying the Chi-square test, we

identiﬁed statistically signiﬁcant lab tests for pre-

dicting imminent hospitalizations. The transformed

dataset, in categorical format, enabled the LR model

to capture patterns in physician decision-making.
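For a single binary prescription indicator, the Chi-square screening step reduces to a test on a 2x2 contingency table. The sketch below uses only the standard library and no continuity correction; whether the study applied a correction, and which significance threshold it used for feature selection, is not stated, so both are assumptions here, and the counts are invented.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test for a 2x2 table (1 degree of freedom).

    a: test prescribed, hospitalized      b: test prescribed, discharged
    c: not prescribed, hospitalized       d: not prescribed, discharged
    Returns (statistic, p_value).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("each row and column needs at least one observation")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, P(X > stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Invented counts for one lab test across ED episodes.
stat, p_value = chi_square_2x2(30, 10, 10, 30)
keep_feature = p_value < 0.05  # assumed significance threshold
```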

To evaluate the predictive value of actual lab test

results, we compared LR and neural networks (NN)

models. LR was chosen for interpretability, while

NNs were evaluated for potential performance im-

provements. Statistically signiﬁcant lab tests were

used as features, and NT-proBNP, a critical biomarker

for HF management (Bozkurt et al., 2021), was im-

puted where missing. Class imbalance was addressed

through weight correction. Elastic Net regularization

prevented overﬁtting in the LR model, with hyper-

parameters tuned via 5-fold cross-validation (train-

ing set size = 80%), optimizing the F1-score. The

NN architecture consisted of two hidden layers with

four neurons each (Dervishi, 2020), trained using bi-

nary cross-entropy loss and stochastic gradient de-

scent with backpropagation.
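The text does not specify how the class weights were computed. One common choice, shown here purely as an assumption, is the "balanced" heuristic in which each class is weighted by n_samples / (n_classes * class_count), and the weights then scale each sample's contribution to the binary cross-entropy loss.

```python
import math


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    n, k = len(labels), len(counts)
    return {cls: n / (k * cnt) for cls, cnt in counts.items()}


def weighted_log_loss(labels, probs, weights):
    """Class-weighted binary cross-entropy, averaged over samples."""
    eps = 1e-12  # avoids log(0)
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1 - eps)
        total += -weights[y] * (y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(labels)


# Toy labels: one hospitalization among four episodes.
labels = [1, 0, 0, 0]
weights = balanced_class_weights(labels)
loss = weighted_log_loss(labels, [0.5, 0.5, 0.5, 0.5], weights)
```

With this weighting, errors on the minority (hospitalized) class cost more than errors on the majority class, which counteracts the tendency to predict discharge for everyone.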

To address the trade-off between the number of

features and the size of the training dataset, we de-

signed variations of LR and NN models with different

numbers of laboratory test features. Because missing lab test results are not missing at random, imputation was not recommended. Instead, rows with missing values were excluded, meaning models with more features had smaller training datasets.

This approach aimed to evaluate how feature count

and training data size impacted model performance.
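The trade-off can be made concrete with a small complete-case filter. The feature names and values below are made up for illustration; only the principle (every added feature can only shrink the set of fully observed rows) comes from the text.

```python
def complete_cases(rows, features):
    """Keep only rows in which every listed feature has a recorded value."""
    return [r for r in rows if all(r.get(f) is not None for f in features)]


# Invented episodes; None or a missing key means the test was not available.
rows = [
    {"crp": 1.2, "ntprobnp": 900.0, "urea": 40.0},
    {"crp": 0.5, "ntprobnp": None, "urea": 35.0},
    {"crp": 2.0, "urea": None},
]

sizes = {
    1: len(complete_cases(rows, ["crp"])),
    2: len(complete_cases(rows, ["crp", "urea"])),
    3: len(complete_cases(rows, ["crp", "urea", "ntprobnp"])),
}
```

Here the usable sample drops from three rows to one as the feature list grows from one to three tests.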

Model performance was assessed using balanced

accuracy, precision, recall, F1-score, AUROC, and

AUPRC (Han et al., 2012). Conﬁdence intervals were

calculated using bootstrapping. To enhance inter-

pretability, we analyzed the LR model’s coefﬁcients

using odds ratios (ORs) and SHAP (SHapley Addi-

tive exPlanations) values (Kasza and Wolfe, 2014;

Lundberg and Lee, 2017). We used SHAP summary

plots to visualize the overall importance of features

and SHAP force plots to analyze individual predic-

tions, offering insights into how speciﬁc lab results

contributed to imminent hospitalization risk (Lund-

berg et al., 2018).
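As an illustration of the evaluation step, the sketch below computes precision, recall, and F1 from binary predictions and wraps a metric in a percentile bootstrap. The number of resamples, the percentile method, and the resampling unit (episodes) are assumptions; the text only states that bootstrapping was used.

```python
import random


def precision_recall_f1(y_true, y_pred):
    """Precision, recall, and F1 for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def bootstrap_ci(y_true, y_pred, metric, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval: resample episodes with replacement."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(metric([y_true[i] for i in idx], [y_pred[i] for i in idx]))
    stats.sort()
    lo_idx = int(n_boot * alpha / 2)
    hi_idx = min(n_boot - 1, int(n_boot * (1 - alpha / 2)))
    return stats[lo_idx], stats[hi_idx]


y_true = [1, 1, 0, 0]
y_pred = [1, 0, 1, 0]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
ci_low, ci_high = bootstrap_ci(
    y_true, y_pred, lambda t, p: precision_recall_f1(t, p)[2], n_boot=200
)
```

The F1-score is the harmonic mean of precision and recall, so reported triples can be cross-checked: a precision of 0.616 and a recall of 0.654, as reported in Section 4 for the prescription-based model, give 2 × 0.616 × 0.654 / (0.616 + 0.654) ≈ 0.634.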

To support real-time interpretation, we developed the interactive ICIHO predictive tool. This tool visualizes predicted hospitalization risk and highlights

the inﬂuence of key lab tests, enhancing clinicians’

ability to make informed decisions based on SHAP-

derived insights.

4 RESULTS

The study population included 3,907 patients with HF (53.4% women) with a median age of 81 years (interquartile range 72-88 years), as depicted in Figures 2a and 2b. Comorbidities such as cardiovascular conditions, chronic kidney disease (CKD), and diabetes were highly prevalent (Table 1).

We analyzed 3,407 patients for imminent outcome

prediction after excluding those without available lab-

oratory test data. These patients contributed to 46,922

distinct episodes, including 27,744 outpatient admis-

sions, 12,686 ED admissions, and 6,488 hospitaliza-

tions. From the ED admissions, 4,693 (37.0% of the

total amount of ED admissions) led to hospital admis-

sions within 24 hours, and were labeled as positives.

A total of 437,683 laboratory tests were conducted

during these ED visits, comprising 252 unique tests.

Available at ICIHO predictive tool


Figure 2: (a) Gender and (b) age distributions of the HF cohort.

Our multivariate LR model for the prediction of imminent hospitalization using only the lab test prescriptions demonstrated reasonable performance, with a recall of 0.654, a precision of 0.616, an F1-score of 0.634, an AUROC of 0.785, and an AUPRC of 0.707. Models trained on lab test results demonstrated similar performance, summarized in Table 2. A comparative analysis of model performance, based on the ROC and precision-recall curves, is provided in Figure 3. Overall, the NN models slightly outperformed the LR models. Models that included more features, namely LR 1 and NN 4, achieved higher precision, F1-score, and AUPRC, while models with fewer features, namely LR 3 and NN 6, scored lower on these metrics. This may indicate that these features have lower predictive power for imminent hospitalization among patients admitted to the ED. We selected the LR 2 model, which is highlighted in Table 2, as the optimal approach considering the trade-off between model complexity and performance. This model demonstrated reasonable performance, with AUROC and AUPRC values of 0.718 and 0.663, respectively.

Table 1: Prevalence of the 10 most frequent chronic conditions identified in the population of HF patients.

Rank  Condition                            Prevalence
1     Essential hypertension               56%
2     Atrial fibrillation                  33%
3     Dyslipidemia                         27%
4     Chronic kidney disease               24%
5     Ischemic congestive cardiomyopathy   23%
6     Obesity                              18%
7     Heart valve disorder                 16%
8     Diabetes mellitus                    14%
9     Allergic disposition                 13%
10    Bacterial pneumonia                  12%

Figure 3: Comparison of (a) ROC curves and (b) precision-recall curves for the different models that were trained. All models perform similarly, with AUROC ranging between 0.71 and 0.73 and AUPRC ranging between 0.60 and 0.69.


Table 2: Summary of the performance metrics computed to evaluate the models trained for predicting imminent hospitalizations of HF patients admitted to the ED. The model variations (LR 1, LR 2, LR 3, NN 4, NN 5, NN 6) evaluate the trade-off between the number of features and the size of the training dataset.

Model                    LR 1                  LR 2                  LR 3                  NN 4                  NN 5                  NN 6
Proportion               50%                   75%                   90%                   50%                   75%                   90%
Nr. features             19                    16                    9                     19                    16                    9
Nr. samples              3685                  8599                  12014                 3685                  8599                  12014
Bal accuracy [95% CI]    0.660 [0.627-0.692]   0.664 [0.640-0.688]   0.659 [0.640-0.679]   0.672 [0.638-0.704]   0.673 [0.652-0.697]   0.664 [0.646-0.685]
Precision [95% CI]       0.643 [0.593-0.693]   0.610 [0.575-0.646]   0.555 [0.526-0.586]   0.643 [0.594-0.692]   0.619 [0.578-0.644]   0.558 [0.529-0.588]
Recall [95% CI]          0.609 [0.556-0.658]   0.624 [0.589-0.659]   0.624 [0.591-0.654]   0.656 [0.610-0.704]   0.653 [0.630-0.698]   0.636 [0.606-0.668]
F1-score [95% CI]        0.625 [0.584-0.667]   0.617 [0.575-0.646]   0.587 [0.561-0.613]   0.649 [0.610-0.690]   0.635 [0.608-0.664]   0.595 [0.569-0.620]
AUROC [95% CI]           0.709 [0.673-0.743]   0.718 [0.693-0.743]   0.711 [0.691-0.733]   0.726 [0.690-0.761]   0.732 [0.708-0.756]   0.726 [0.706-0.747]
AUPRC [95% CI]           0.668 [0.614-0.720]   0.663 [0.630-0.701]   0.596 [0.564-0.631]   0.691 [0.634-0.744]   0.682 [0.649-0.719]   0.605 [0.571-0.640]

The odds ratios corresponding to the coefficients of model LR 2 are displayed in Table 3. These values reveal that C-reactive protein (CRP) and NT-proBNP are the laboratory tests that most strongly increase the predicted odds of imminent hospitalization for patients admitted to the ED. Erythrocytes and Lymphocytes had odds ratios below 1, indicating a negative contribution to imminent hospitalization. Although age is considered a risk factor for multimorbidity, its influence on the prediction of imminent hospitalization of HF patients is small.
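To make the link between odds ratios and LR coefficients explicit: an odds ratio is the exponential of the corresponding coefficient, so a value above 1 corresponds to a positive coefficient and a value below 1 to a negative one. The sketch below uses three odds ratios copied from Table 3. The "unit" of each feature depends on how the inputs were scaled, which the text does not state, so the multiplier function is only meaningful under that unknown scaling.

```python
import math

# Odds ratios copied from Table 3 (model LR 2).
odds_ratios = {
    "C-reactive protein": 1.4689,
    "NT-proBNP": 1.2919,
    "Erythrocytes": 0.7980,
}


def coefficient(odds_ratio):
    """Logistic regression coefficient implied by an odds ratio."""
    return math.log(odds_ratio)


def odds_multiplier(odds_ratio, delta):
    """Factor applied to the odds when the feature increases by `delta` units."""
    return odds_ratio ** delta


coefs = {name: coefficient(value) for name, value in odds_ratios.items()}
crp_two_units = odds_multiplier(odds_ratios["C-reactive protein"], 2)
```

Because the model is additive on the log-odds scale, the multipliers of several features combine by multiplication.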

Table 3 lists the P-values associated with each

coefﬁcient of the multivariate model. Certain vari-

ables, such as Hematocrit, Creatinine, Sodium, and

Monocytes, seem to be statistically insigniﬁcant in

predicting imminent hospitalization in a multivariate

approach.

The SHAP summary plot shown in Figure 4a fur-

ther reinforces these ﬁndings. High values of CRP,

Urea, Leukocytes, and NT-proBNP are associated

with higher SHAP values, indicating their strong posi-

tive inﬂuence on the model’s predictions. Conversely,

low values of Erythrocytes are associated with higher

contributions to the prediction of imminent hospital-

izations.

Figure 4b showcases a SHAP force plot example, illustrating the prediction of imminent hospitalization for an HF patient from the test dataset who was correctly classified. The model identified that the values of Erythrocytes (LOINC 789-8 = 2.16 × 10^12/L), Leukocytes (LOINC 6690-2 = 16.22 × 10^9/L), and CRP (LOINC 1988-5 = 8.65 mg/dL) had a significant impact on predicting imminent hospitalization, as indicated by the wider red bars in the plot.
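For a linear model such as LR, SHAP values have a closed form when features are treated as independent: the contribution of feature i is its coefficient times the distance of the patient's value from the background mean, measured in log-odds. The sketch below shows this additive decomposition, which is what a force plot draws. All coefficients, means, and patient values are invented, and the explainer configuration actually used in the study is not specified.

```python
import math


def linear_shap(coefs, x, background_mean):
    """SHAP values of a linear model in log-odds space (independent features)."""
    return [b * (xi - mu) for b, xi, mu in zip(coefs, x, background_mean)]


def predicted_probability(intercept, coefs, background_mean, shap_values):
    """Rebuild the prediction as base value plus the sum of SHAP values."""
    base = intercept + sum(b * mu for b, mu in zip(coefs, background_mean))
    log_odds = base + sum(shap_values)
    return 1 / (1 + math.exp(-log_odds))


# Invented standardized example with two features.
intercept = 0.0
coefs = [0.4, -0.2]
background_mean = [1.0, 3.0]
patient = [2.0, 1.0]

phi = linear_shap(coefs, patient, background_mean)
prob = predicted_probability(intercept, coefs, background_mean, phi)
```

Both features push this hypothetical patient toward hospitalization: the first because its value is above average with a positive coefficient, the second because its value is below average with a negative coefficient, which mirrors how low erythrocyte counts raise the predicted risk.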

5 DISCUSSION

We developed a framework for processing clinical data to gain insights into the multimorbid population with HF, uncovering patterns and risks associated with this condition. The ability to utilize healthcare data for better characterization of complex patients and the development of clinical strategies represents a step forward in the management of HF. Compared to related works, our model achieves similar performance while uniquely incorporating feature contribution analysis using odds ratios and SHAP. Additionally, we developed a user-friendly web interface to visualize predictions and feature impacts, supporting clinical decision-making.

Firstly, our multivariate analysis, which focused

on lab test prescriptions for HF patients admitted to

the ED, enables early identiﬁcation of patients at an

increased risk of imminent hospitalization. We be-

lieve that by exploring the rationale behind each lab

test prescription, we can partially reveal the intricate

clinical judgments and organizational factors inﬂu-

encing these decisions. This approach opens up new

research avenues for clinical and operational improve-

ments in high-demand settings, such as the ED.

In addition, we have shown the practical utility of commonly available laboratory test results in conducting risk stratification to predict short-term hospital admissions for HF patients. These models made reasonably accurate predictions of hospital admission. We are optimistic that integrating additional information like demographics, vital signs, and diagnoses can further enhance the models' discrimination capabilities. Using these data, clinicians can more accurately gauge the need for hospital admission of these patients, and hospital staff can obtain early estimates of admission rates that can, for instance, lead to improved efficiency in hospital bed planning and resource allocation.

Table 3: Imminent hospitalization odds ratio for each coefficient of the prediction model named LR 2. The p-values for the null hypothesis that a coefficient is equal to zero (i.e., the odds ratio is equal to one) were computed with a Wald test (Wald, 1943).

Variable   Component                                  Odds ratio   P > |z|
1988-5     C-reactive protein                         1.4689       < 0.05
33762-6    NT-proBNP                                  1.2919       < 0.05
6690-2     Leukocytes                                 1.2359       < 0.05
22664-7    Urea                                       1.2339       < 0.05
788-0      Erythrocyte distribution width             1.1745       < 0.05
4544-3     Hematocrit                                 1.1118       0.068
Age        Age                                        1.0597       < 0.05
2823-3     Potassium                                  1.0581       < 0.05
40248-7    Creatinine^baseline                        1.0414       0.353
2951-2     Sodium                                     1.0110       0.805
5905-5     Monocytes/100 leukocytes                   0.9764       0.337
785-6      Erythrocyte mean corpuscular hemoglobin    0.9037       < 0.05
713-8      Eosinophils/100 leukocytes                 0.8829       < 0.05
2075-0     Chloride                                   0.8816       < 0.05
736-9      Lymphocytes/100 leukocytes                 0.8463       < 0.05
789-8      Erythrocytes                               0.7980       < 0.05

Our machine learning models prioritized inter-

pretability, thereby enhancing trust and clinical ap-

plicability. Rigorous evaluation using logistic re-

gression weights and SHAP values ensured transpar-

ent and practically relevant outcomes, vital for real-

world applications and future research (Lundberg and

Lee, 2017). This methodology permits detailed in-

terpretability, allowing clinicians to perform individ-

ualized patient risk assessments, signiﬁcantly improv-

ing clinical utility. Our ﬁndings indicate that ele-

vated levels of NT-proBNP and CRP are positively

correlated with imminent hospital admission, consis-

tent with prior studies (Bozkurt et al., 2021; Anand

et al., 2005). NT-proBNP is a well-established marker

of HF severity, while the link between CRP and HF

prognosis, suggesting possible concurrent infections

or unaddressed inﬂammatory diseases, warrants fur-

ther investigation. Moreover, we identiﬁed a neg-

ative correlation between erythrocyte concentration

and patient outcomes, reinforcing existing evidence

of anemia’s adverse impact on HF prognoses, includ-

ing higher hospitalization rates (Anand et al., 2004).

Intriguingly, our analysis revealed that when labora-

tory data is included, age becomes a less signiﬁcant

predictor of imminent hospitalization.

Furthermore, the study highlights the utility of the

SHAP force plot in assessing individual patient risks,

offering a detailed insight into the speciﬁc impacts of

features on model predictions. This analytical tool

increases the model’s clinical relevance by elucidat-

ing not just the direction but also the magnitude of a

feature’s impact on predictions of imminent hospital-

ization. Such precise feature-level interpretability is

invaluable for predicting heightened risks in scenar-

ios with interacting factors, rendering it a potent in-

strument for handling complex clinical scenarios like

multimorbidity.

This study also has limitations that warrant discussion. A primary limitation is that the analyzed data come from a single hospital, which may introduce biases associated with the diversity of diseases treated and the complexity of healthcare delivery at this facility. Although the hospital provides a range of care services, the lack of universal primary healthcare could limit the scope of our analysis, thus restricting the depth of insights into disease complexity and nuances in healthcare provision. Moreover, our focus on laboratory test prescriptions might overlook essential aspects of patient histories and experiences prior to admission to the ED, which are captured in different data types. Integration with alternative data sources, such as clinical notes and drug prescriptions, could bolster confidence in our findings. Our observed correlations between prescription patterns and clinical outcomes do suggest implicit clinical and organizational processes that are interesting research avenues to explore. However, extra caution is advised in interpreting their significance. Not only is further external validation necessary, but there is also a need to be mindful of potential biases that this approach may introduce and perpetuate, such as discrimination against underrepresented subpopulations. This issue has been increasingly recognized in the literature and merits further investigation before any clinical implementation (Obermeyer et al., 2019).

Figure 4: (a) SHAP summary plot computed using laboratory results and age of all episodes of the test dataset. (b) SHAP force plot of an HF patient correctly classified as imminent hospitalization, in which red represents the lab test results (or age) that increase the chance of imminent hospitalization, while blue represents those that push the prediction toward the negative outcome.

Additionally, the inclusion of more comprehen-

sive clinical information could enhance data interpre-

tation and improve algorithm performance. For ex-

ample, ejection fraction, a key factor in heart failure

(HF), was not included in our study. Often recorded in

free text, this information presents challenges for sys-

tematic and reliable extraction and was consequently

not utilized. We intend to address this limitation in

future work.

Finally, it is essential to emphasize that the tools developed in this study have only gone through initial trial stages. The utilization of AI systems to aid clinical decision-making represents a significant innovation in healthcare. However, their adoption requires meticulous evaluation and strict adherence to regulations, particularly in Europe. The tools and methodologies described in our research serve as illustrations of potential approaches to enhance clinical decision-making by leveraging existing clinical data with its inherent limitations. Prior to their implementation in real-world clinical settings, these tools must undergo comprehensive regulatory processes to ensure their safety, effectiveness, and ethical compliance.

6 CONCLUSION

In conclusion, this study has successfully developed a

comprehensive framework for analyzing clinical data,

particularly in the context of patients with HF with

concurrent multimorbidity. Our approach contributes

to more reﬁned risk stratiﬁcation and informed clin-

ical decision-making. This methodology showcases

the potential of healthcare data to improve clinical

insight and individualized risk assessment that can

eventually lead to better patient outcomes.

Looking ahead, our research lays the groundwork for future investigations to enhance predictive models that make use of laboratory data and to delve deeper into the impacts of various comorbidities on HF outcomes from a long-term perspective. Prioritizing the expansion of data collection methods will be essential to enrich the quality and relevance of the data in future studies. Furthermore, the versatility of our analytical framework holds promise for broader applications, extending to diverse patient populations with chronic conditions such as Diabetes Mellitus, Chronic Kidney Disease, and Chronic Obstructive Pulmonary Disease. We intend to address these in future work.

FUNDING

This work was developed under the IntelligentCare project LISBOA-01-0247-FEDER-045948, which is co-financed by the ERDF/LISBOA2020 and by FCT, Portugal, under CMU-Portugal, and by FCT, Portugal, through the INESC-ID Research Unit, ref. UIDB/00408/2020 and ref. UIDP/00408/2020.

ACKNOWLEDGEMENTS

We acknowledge Carlos Magalhães and Jaime

Machado, from Hospital da Luz, for the data extrac-

tion.

HEALTHINF 2025 - 18th International Conference on Health Informatics