Optimizing Blood Transfusions and Predicting Shortages in
Resource-Constrained Areas
El Arbi Belfarsi (https://orcid.org/0009-0006-4145-5492), Sophie Brubaker (https://orcid.org/0009-0000-3708-9000) and Maria Valero (https://orcid.org/0000-0001-8913-9604)
Kennesaw State University, 1100 South Marietta Pkwy SE, Marietta, GA 30060, U.S.A.
Keywords:
Artificial Intelligence, Machine Learning, Heuristic Search, Constrained Optimization, NoSQL Databases,
Blood Donation Systems, Healthcare Logistics, Public Health.
Abstract:
Our research addresses the critical challenge of managing blood transfusions and optimizing allocation in
resource-constrained regions. We present heuristic matching algorithms for donor-patient and blood bank se-
lection, alongside machine learning methods to analyze blood transfusion acceptance data and predict potential
shortages. We developed simulations to optimize blood bank operations, progressing from random allocation
to a system incorporating proximity-based selection, blood type compatibility, expiration prioritization, and
rarity scores. Moving from blind matching to a heuristic-based approach yielded a 28.6% marginal improve-
ment in blood request acceptance, while a multi-level heuristic matching resulted in a 47.6% improvement.
For shortage prediction, we compared Long Short-Term Memory (LSTM) networks, Linear Regression, and
AutoRegressive Integrated Moving Average (ARIMA) models, trained on 170 days of historical data. Linear
Regression slightly outperformed others with a 1.40% average absolute percentage difference in predictions.
Our solution leverages a Cassandra NoSQL database, integrating heuristic optimization and shortage pre-
diction to proactively manage blood resources. This scalable approach, designed for resource-constrained
environments, considers factors such as proximity, blood type compatibility, inventory expiration, and rarity.
Future developments will incorporate real-world data and additional variables to improve prediction accuracy
and optimization performance.
1 INTRODUCTION
Blood reserve shortages represent a critical challenge
for healthcare systems worldwide, particularly in low-
income and disaster-prone areas. The World Health
Organization (WHO) recommends that 1% to 3% of
a country’s population should donate blood to meet its
basic healthcare needs (Talie et al., 2020). This guid-
ance emphasizes the importance of regular and vol-
untary blood donations to ensure a consistent and ade-
quate blood supply. However, numerous countries fall
well below this threshold, creating tremendous strain
on their healthcare systems (Kralievits et al., 2015;
Barnes et al., 2022; Roberts et al., 2019).
Figure 1 illustrates the significant disparity in blood donation rates across countries of different income levels. According to the WHO report from June 2023, these rates vary substantially based on a country’s economic status (World Health Organization, 2023). The data, expressed as donations per 1,000 people, reveals a clear descending trend from high- to low-income countries. High-income countries lead with 31.5 donations, followed by upper-middle-income countries with 16.4, lower-middle-income countries with 6.6, and finally, low-income countries with the lowest rate of 5.0.

Figure 1: Blood donation rates per 1000 people by country income level (World Health Organization, 2023).
These statistics underscore the disparities in blood
donation rates across global economic divides. In
low-income countries, the rate is as low as 5.0 dona-
tions per 1,000 people, which falls dramatically short
of the World Health Organization’s median recom-
mendation of 10–20 donations per 1,000 people to
meet essential healthcare needs. This gap of approxi-
mately 67% below the recommended levels highlights
the severe challenges these countries face in maintain-
ing adequate blood supplies. Such shortages strain
healthcare systems, especially in responding to medi-
cal emergencies and routine transfusion requirements,
ultimately compromising patient care and increasing
mortality risks (Raykar et al., 2015).
Another critical factor to consider is the height-
ened occurrence of blood shortages in areas affected
by natural disasters or conflict, where the demand for
blood can spike unpredictably, overwhelming already
overburdened healthcare systems. For example, the
COVID-19 pandemic caused significant disruptions
to global blood donations due to lockdowns, fear of
infection, and limited access to blood collection sites.
This led to severe blood shortages in countries like the
United States, resulting in postponed surgeries and
delayed transfusions, which had a serious impact on
patient care (Riley et al., 2021). Similarly, during the
Delta variant surge, the blood supply was critically
low, necessitating rapid inventory management to en-
sure continuity of clinical care (Petersen and Jhala,
2023). These crises highlight the need for robust, eas-
ily deployable systems that can optimize blood trans-
actions and predict shortages in resource-constrained
environments to mitigate the impacts of sudden de-
mand surges (Van Denakker et al., 2023).
Moreover, in many low- and middle-income coun-
tries, blood donation systems heavily depend on fam-
ily or replacement donors, with patients or their fam-
ilies often responsible for transferring blood between
centers. For instance, in regions such as Africa and
Latin America, families frequently coordinate blood
donations and transport due to inadequate centralized
systems (World Health Organization, 2022). Greece,
where family donations are vital for conditions like
thalassemia, also exemplifies how families play a cru-
cial role in managing blood transfers (American So-
ciety of Hematology, 2020).
Given these challenges, there is a pressing need
for innovative strategies to optimize blood supply
management and accurately predict shortages, par-
ticularly in resource-limited areas. The WHO advo-
cates for a transition to 100% voluntary, unpaid dona-
tions as a critical step towards establishing sustainable
blood supplies in these regions. However, achieving
this objective requires more than policy reforms—it
calls for the creation of scalable, reliable systems
capable of functioning effectively even under high-
pressure conditions and resource constraints (Laer-
mans et al., 2022).
This paper introduces a novel approach that ap-
plies heuristic methods and machine learning tech-
niques independently to optimize blood transactions
and predict shortages in resource-constrained areas.
Our system leverages the power of data-driven predic-
tive modeling along with domain-specific heuristics
to create a more efficient and effective blood supply
management solution.
Key features of our proposed system include:
Heuristic approaches: These approaches include
the utilization of blood compatibility matrices to
maximize the utility of available blood supplies,
the implementation of a rarity score for each
blood group to prioritize scarce resources, consid-
eration of blood product expiration dates to mini-
mize wastage, and optimization of travel distances
to improve logistics and reduce costs.
Machine learning components: The machine
learning components involve predictive models to
forecast a center’s request acceptance rate over a
10-day period and data-driven algorithms to opti-
mize the direction of blood donations.
Integration of heuristics and machine learning:
A combined approach leverages the strengths of
both methodologies, employing goal-oriented op-
timization to maximize the percentage of accepted
blood requests.
This system is designed to tackle the significant
disparities in blood donation and distribution, with a
specific emphasis on low-income countries, conflict-
affected areas, and regions vulnerable to natural dis-
asters. Our solution aims to offer a scalable frame-
work that can enhance blood supply management in
these high-need regions, addressing the complex fac-
tors that affect blood donation, storage, and distribu-
tion.
2 BACKGROUND
2.1 Blood Type Distribution and
Compatibility
Understanding the distribution of blood types in a
population and their compatibility is crucial for op-
timizing blood donation and distribution strategies.
This section provides an overview of typical blood
type distribution and compatibility.
2.1.1 Blood Type Distribution
The distribution of ABO blood types can vary across
different populations. Table 1 presents a typical dis-
tribution based on data from a large sample:
Table 1: ABO blood types distribution.
Blood Type Abundance
A 33.4%
B 6%
O 56.8%
AB 3.8%
This distribution shows that Types O and A are by far the most common, followed by Type B, while Type AB is the least common. These percentages can inform predictions about blood supply and demand in a given population (Sarhan et al., 2009).
2.1.2 Blood Type Compatibility
Blood type compatibility is a critical factor in transfu-
sions and donation strategies. Figure 2 illustrates the
compatibility matrix between different blood types.
Figure 2: Blood type compatibility matrix (BCS, 2022).
The compatibility matrix illustrates the safe blood
donation and transfusion pathways between different
blood types. Key insights include the following:
Type O is the universal donor, able to give blood
to all other types.
Type AB is the universal recipient, able to receive
blood from all other types.
Types A and B can donate to their own type and
to AB.
The blood type distribution data and compatibility
matrix are integral to our methodology. We use this
information to simulate realistic donor populations
and blood transaction scenarios. Moreover, the com-
patibility matrix informs our heuristic search algo-
rithm, enabling efficient matching of blood requests
with available inventory. Additionally, we assign rar-
ity scores to each blood group based on this data,
which plays a key role in optimizing the allocation
process.
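To make this concrete, the compatibility rules above translate directly into a lookup table. The following Python sketch (the names are ours, not the paper's implementation) encodes the ABO-level matrix of Figure 2; our simulations use an analogous map over the eight ABO/Rh types:

```python
# A minimal sketch: the ABO compatibility matrix from Figure 2 encoded as a
# map from donor type to the set of recipient types it can safely serve.
DONOR_COMPATIBILITY = {
    "O":  {"O", "A", "B", "AB"},   # universal donor
    "A":  {"A", "AB"},
    "B":  {"B", "AB"},
    "AB": {"AB"},                  # can donate only to AB (universal recipient)
}

def can_transfuse(donor_type: str, recipient_type: str) -> bool:
    """Return True if blood from donor_type is safe for recipient_type."""
    return recipient_type in DONOR_COMPATIBILITY[donor_type]

assert can_transfuse("O", "AB")      # O donates to everyone
assert not can_transfuse("AB", "O")  # AB donates only to AB
```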
3 RELATED WORK
3.1 Blood Bank Management Systems
Recent advancements in blood bank management
have increasingly focused on leveraging emerging
technologies to address operational inefficiencies. A
significant contribution by (Sandaruwan et al., 2020)
integrates machine learning, data clustering, and
blockchain technologies. A key innovation is the ap-
plication of Long Short-Term Memory (LSTM) net-
works for blood demand forecasting, which optimizes
inventory management by improving demand predic-
tion accuracy and operational efficiency. Figure 3 il-
lustrates the comparison between actual and predicted
blood demand across all blood types.
Figure 3: Blood demand prediction: actual values versus
predicted values (Sandaruwan et al., 2020).
Their system was validated using empirical data
from the Blood Bank of Sri Lanka, with results indi-
cating a strong potential for accurately forecasting fu-
ture blood demand. The incorporation of blockchain
technology enhances the system by introducing an ad-
ditional layer of security, ensuring the integrity, trace-
ability, and transparency of the blood supply chain.
Several critical points warrant consideration re-
garding the methodology and experimental design of
this study: 1) the authors employed a highly imbal-
anced data split (98.2% for training and 0.2% for test-
ing), which may raise concerns about the possibility
of overfitting and could affect the reliability of the re-
ported performance metrics; and 2) while the LSTM
approach is promising, the study does not provide a
comparison with other time series models, making it
difficult to fully evaluate the relative effectiveness of
this method in the context of blood demand forecast-
ing (Sandaruwan et al., 2020).
Beyond Sandaruwan’s work, (Ben Elmir et al.,
2023) proposed a smart platform that uses machine
learning and time series forecasting models to reduce
blood shortages and waste by balancing blood col-
lection and distribution. Their approach achieved an
11% increase in collected blood volume and a 20%
reduction in inventory wastage compared to histori-
cal data. Moreover, (Shih and Rajendran, 2019) com-
pared machine learning algorithms and traditional
time series methods like ARIMA for forecasting
blood supply in Taiwan’s blood services. Their results
showed that time series methods, particularly sea-
sonal ARIMA, outperformed machine learning mod-
els in predicting blood demand with better accuracy
and lower error rates.
Recent studies also explore blockchain applica-
tions in blood bank management. (Wijayathilaka
et al., 2020) developed the “LifeShare” platform, in-
corporating blockchain technology for secure, trans-
parent tracking of donors and blood supply, coupled
with machine learning algorithms for blood demand
forecasting.
Despite these advancements, challenges remain.
For example, (Farrington et al., 2023) applied deep
reinforcement learning (DRL) to optimize platelet
inventory management. Their results demonstrated
DRL's effectiveness in reducing wastage while ensur-
ing sufficient supply for hospitals, offering a promis-
ing avenue for future research in optimizing other
blood components.
These studies provide a robust foundation for
further research, particularly in combining machine
learning models with blockchain technologies to en-
hance the security, accuracy, and efficiency of blood
bank operations.
Our work builds upon this concept by:
Employing a more rigorous model evaluation and
benchmarking methodology to enhance the relia-
bility of results.
Integrating heuristic optimization techniques that
account for blood type compatibility, rarity
scores, and logistical factors such as distance trav-
eled, in order to improve the blood transfusion ac-
ceptance rate.
Extending predictive capabilities to forecast a
center’s request acceptance rate over a 10-day
horizon.
Developing an open-source platform capable of
adapting to real-time fluctuations in blood supply
and demand, facilitating collaboration and seam-
less deployment.
These improvements aim to deliver a more com-
prehensive and practical solution to the challenges
of blood bank management, especially in resource-
limited environments.
4 METHODOLOGY
4.1 Database Schema
To optimize data management and storage within
our blood bank management system, we employed
a Cassandra NoSQL database. The schema design
was carefully structured to represent the key entities
and relationships within the system, enabling efficient
data retrieval and manipulation to support our simula-
tions and predictive models.
Figure 4: Cassandra database schema.
Figure 4 illustrates the database schema, which
consists of four main tables:
1. blood_banks: Contains detailed information on
individual blood banks, including contact infor-
mation, geographic location, and relevant opera-
tional data.
2. users: Stores comprehensive data for both donors
and patients, including blood type, contact details,
and associated medical information.
3. blood_inventory: Monitors the current inventory
of blood at each bank, tracking quantities, expira-
tion dates, and blood group availability.
4. blood_transactions: Logs all blood donation and
request transactions, linking user data, blood bank
operations, and inventory management for a com-
prehensive audit trail.
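The exact column definitions are shown only in Figure 4, so the following is a minimal, hypothetical CQL sketch of the inventory table, issued through the DataStax Python driver against a local single-node cluster; column names and keys are our assumptions:

```python
from cassandra.cluster import Cluster

session = Cluster(["127.0.0.1"]).connect()
session.execute("""
    CREATE KEYSPACE IF NOT EXISTS blood_system
    WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}
""")
session.set_keyspace("blood_system")

# Hypothetical inventory table modeled after the entities in Figure 4.
session.execute("""
    CREATE TABLE IF NOT EXISTS blood_inventory (
        bank_id int,
        expiration_date date,
        batch_id uuid,
        blood_type text,
        component text,
        quantity int,
        PRIMARY KEY (bank_id, expiration_date, batch_id)
    ) WITH CLUSTERING ORDER BY (expiration_date ASC, batch_id ASC)
""")
```

Partitioning by bank_id makes a bank's full stock a single-partition read, and clustering by expiration_date returns the soonest-to-expire units first, which suits the expiration-prioritization heuristic described later.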
Our study employed a two-pronged approach to
optimize blood bank management: simulation-based
optimization and shortage prediction modeling.
4.2 Simulation-Based Optimization
We developed three progressively advanced simula-
tions to optimize blood bank operations:
1. The initial simulation focused on the fundamental
processes of blood requests and donations. Blood
distribution was performed through random allo-
cation, establishing baseline performance metrics
to serve as a reference for evaluating more ad-
vanced optimization methods.
2. The second simulation incorporated an optimiza-
tion strategy based on geographic proximity, pri-
oritizing blood banks nearest to the point of de-
mand. A blood type compatibility matrix was in-
tegrated, and soon-to-expire blood units were pri-
oritized to reduce wastage and improve resource
utilization efficiency.
3. The final simulation introduced rarity scores for
various blood types to more effectively manage
scarce resources. This simulation balanced prox-
imity, expiration dates, and rarity in its alloca-
tion decisions, while also tracking expired units
to identify areas for further optimization.
Each simulation ran for a 30-day period, process-
ing 40-50 transactions daily. We measured key per-
formance indicators including acceptance ratio, total
distance traveled, and number of expired resources.
4.3 Shortage Prediction Modeling
To forecast potential shortages, we developed and
evaluated three time-series prediction models for
comparative analysis:
1. The first model, linear regression, was imple-
mented as a baseline to capture linear trends in
the acceptance ratios.
2. The second model, Long Short-Term Memory
(LSTM) networks, leveraged deep learning tech-
niques to identify complex temporal patterns. It
was trained on 170 days of historical data to gen-
erate predictive forecasts.
3. Our final model, the Autoregressive Integrated
Moving Average (ARIMA), applied traditional
time-series forecasting techniques, effectively ac-
counting for both trends and seasonality in the
data.
All models were trained to forecast the acceptance
ratio for day 180, using data from the preceding 170
days. Model performance was assessed by calculating
the mean absolute percentage error (MAPE) between
the predicted and actual values.
4.4 Performance Metrics
To evaluate the effectiveness of our optimization
strategies and predictive models, we employed the
following metrics:
1. Acceptance Ratio: Proportion of accepted blood
requests.
2. Total Distance Traveled: Sum of distances for all
transactions.
3. Marginal Performance (MP): Relative improve-
ment in acceptance ratio.
4. Average Absolute Percentage Difference: For as-
sessing prediction accuracy.
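As a reference, the sketch below shows how these metrics can be computed; the function names are ours, and MP corresponds to Equation 1 in Section 6:

```python
# Evaluation metrics used throughout the simulations and forecasts.
def acceptance_ratio(accepted: int, denied: int) -> float:
    """Proportion of blood requests that were fulfilled."""
    return accepted / (accepted + denied)

def marginal_performance(rate: float, baseline: float) -> float:
    """Relative improvement in acceptance ratio over a baseline (Equation 1)."""
    return (rate - baseline) / (1.0 - baseline)

def mean_abs_pct_difference(predicted: list, actual: list) -> float:
    """Average absolute percentage difference between predictions and truth."""
    return sum(abs(p - a) / a for p, a in zip(predicted, actual)) / len(actual) * 100
```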
This comprehensive methodology enabled itera-
tive improvements in blood bank operations through
simulation, while also facilitating the development of
accurate shortage prediction capabilities.
5 PROPOSED WORK
5.1 Dataset Synthesis and Description
The sensitive nature of healthcare data, coupled with
stringent privacy regulations governing blood dona-
tion records, presented considerable challenges in ob-
taining real-world data for this study. Health Informa-
tion Privacy laws, such as the Health Insurance Porta-
bility and Accountability Act (HIPAA) in the United
States, impose strict controls over the use and dissem-
ination of patient health information, including blood
donation records. Additionally, the cross-institutional
structure of blood supply chains, involving multiple
stakeholders, further complicates the process of com-
prehensive data collection due to logistical and legal
constraints.
To address these limitations while maintaining the
integrity of our analysis, we generated a synthetic
dataset designed to replicate real-world blood bank
operations. This approach enables us to explore a
wide range of scenarios and rigorously test our al-
gorithms, without compromising individual privacy
or breaching data protection regulations. Our dataset
comprises four main components:
1. Blood Banks: A dataset comprising 20 simulated
blood banks, each characterized by unique identi-
fiers, names, geographic locations (represented by
ZIP codes), and associated contact information.
2. Users: A population of 1,000 individuals, catego-
rized as donors, patients, or both. Each user has a
unique identifier, demographic information, blood
type, and ZIP codes.
3. Blood Inventory: A comprehensive record of
blood components (Red Blood Cells, Whole
Blood, Platelets, and Plasma) available at each
blood bank. Each inventory entry includes details
such as blood type, quantity, expiration date, and
a unique batch identifier.
4. Blood Transactions: 4200 logs of donation events,
linking donors to specific inventory batches. Each
transaction includes information on the donor, re-
cipient blood bank, blood component, quantity,
and date of donation.
To generate realistic user data, we employed the
Faker library in Python (Jokiaho and community con-
tributors, 2023), which offers a comprehensive set of
methods for creating plausible synthetic data, such
as names, addresses, phone numbers, and email ad-
dresses. Utilizing the Faker library allowed us to en-
sure that our synthetic dataset closely replicates real-
world data while preserving anonymity and adhering
to privacy standards.
A key aspect of our dataset is the distribution of
blood types, which we based on global averages. Ta-
ble 2 shows the blood type distribution used in our
simulation:
Table 2: Blood type distribution in synthetic dataset.
Blood Type Probability
O+ 0.38
A+ 0.34
B+ 0.09
O- 0.07
A- 0.06
AB+ 0.03
B- 0.02
AB- 0.01
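As an illustration, user records can be synthesized by combining Faker fields with a weighted draw over the Table 2 distribution. The field names below are ours, not the exact dataset schema:

```python
import random
from faker import Faker

fake = Faker()
BLOOD_TYPES = ["O+", "A+", "B+", "O-", "A-", "AB+", "B-", "AB-"]
WEIGHTS     = [0.38, 0.34, 0.09, 0.07, 0.06, 0.03, 0.02, 0.01]  # Table 2

def make_user(user_id: int) -> dict:
    """Generate one plausible but entirely synthetic user record."""
    return {
        "user_id": user_id,
        "name": fake.name(),
        "email": fake.email(),
        "zip_code": fake.zipcode(),
        "blood_type": random.choices(BLOOD_TYPES, weights=WEIGHTS, k=1)[0],
        "role": random.choice(["donor", "patient", "both"]),
    }

users = [make_user(i) for i in range(1000)]  # the 1,000-user population
```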
Other key features of our dataset include the fol-
lowing: 1) Varied expiration dates for different blood
components, accurately reflecting real-world storage
constraints. 2) Simulated geographical distribution of
blood banks and users, represented using ZIP codes
to mimic real-world spatial dispersion. 3) Consistent
linkage between inventory and donation transactions
through unique batch identifiers, ensuring traceability
and integrity of the data.
This synthetic dataset allows us to evaluate and
validate our optimization algorithms under a range of
scenarios, including rare blood type shortages, geo-
graphical constraints, and urgent inventory manage-
ment challenges. While it does not fully capture the
complexities of real-world blood bank operations, it
provides a robust framework for testing and refining
our blood supply chain optimization techniques, serv-
ing as a proof of concept for future development and
more advanced applications.
The synthetic dataset, along with its accompany-
ing schema and documentation, is made available for
use by other researchers to facilitate further explo-
ration and development in the field of blood supply
chain optimization (Belfarsi, 2024).
5.2 System Design
Figure 6 illustrates the high-level architecture of
our blood donation optimization system. At the
core of our design is the Apache Cassandra NoSQL
database (Abramova and Bernardino, 2013), chosen
for its ability to handle large-scale, distributed data
with high availability and partition tolerance.
5.2.1 CAP Theorem and Cassandra
The Consistency, Availability and Partition Tolerance
(CAP) theorem, a fundamental principle in distributed
database systems, states that it is impossible for a
distributed data store to simultaneously provide more
than two out of the following three guarantees:
Figure 5: Visual representation of the CAP Theorem.
As illustrated in Figure 5, the three guarantees are:
Consistency: Every read receives the most recent
write or an error.
Availability: Every request receives a response,
without guarantee that it contains the most recent
version of the information.
Partition tolerance: The system continues to op-
erate despite arbitrary partitioning due to network
failures.
5.2.2 Importance of AP in Blood Donation
Systems
The AP characteristics of Cassandra are vital to our
system for several reasons. High availability is cru-
cial, especially in emergencies, where the system
must remain operational. Cassandra’s multi-master architecture ensures continued functionality even if
some nodes fail, unlike traditional SQL databases,
which rely on a single master and are vulnerable to
downtime in such cases (Yedilkhan et al., 2023). Scal-
ability is another key factor, as blood donation net-
works span large regions, and Cassandra’s horizon-
tal scaling allows seamless growth—something reg-
ular SQL databases struggle with due to their in-
herent vertical scaling limitations. While we trade
strong consistency for eventual consistency, it suits
most scenarios, and tunable consistency levels offer
flexibility when needed. Cassandra’s partition toler-
ance also ensures the system stays operational dur-
ing network issues. Furthermore, its native query-
ing language, CQL, is powerful and efficient, pro-
viding a robust interface for managing complex data
operations, far beyond the capabilities of traditional
SQL for distributed data environments (Perez-Miguel
et al., 2015).
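As a brief illustration of this tunability, the DataStax Python driver lets each statement request its own consistency level. The schema and statements below are hypothetical, reusing the assumed blood_inventory table from Section 4.1:

```python
from cassandra import ConsistencyLevel
from cassandra.cluster import Cluster
from cassandra.query import SimpleStatement

session = Cluster(["127.0.0.1"]).connect("blood_system")

# Availability-first read: any single replica may answer.
fast_read = SimpleStatement(
    "SELECT quantity FROM blood_inventory WHERE bank_id = 7",
    consistency_level=ConsistencyLevel.ONE,
)

# Stronger acknowledgment for a write that records new stock.
safe_write = SimpleStatement(
    "INSERT INTO blood_inventory "
    "(bank_id, expiration_date, batch_id, blood_type, component, quantity) "
    "VALUES (7, '2023-02-12', uuid(), 'O+', 'WB', 2)",
    consistency_level=ConsistencyLevel.LOCAL_QUORUM,
)

session.execute(fast_read)
session.execute(safe_write)
```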
5.2.3 System Components

Figure 6: System architecture high-level view.

As shown in Figure 6, our system comprises the following key components:
Database: Stores data in four main tables: Donor,
BloodBank, Transactions, and Credentials.
Simulator Module: Acts as the central process-
ing unit, pulling data from Cassandra, instantiat-
ing Donor and BloodBank objects, and simulating
blood donation transactions.
Data Flow: The simulator pulls data from Cas-
sandra, processes it, writes new transactions, and
updates the database, creating a continuous cycle
of data management.
This architecture enables efficient data manage-
ment, real-time processing of blood donation transac-
tions, and seamless scalability to support the expan-
sion of donor networks and blood banks. By utiliz-
ing Apache Cassandra as our database solution, we
ensure that the system remains highly available and
partition-tolerant—both of which are essential in the
context of time-sensitive, geographically distributed
blood donation systems. These capabilities are criti-
cal for maintaining uninterrupted operations and reli-
able access to blood inventory data, even in the pres-
ence of network disruptions or expanding system de-
mands.
5.3 Implementation Details
5.3.1 Simulation 1: Randomized Transactions
Our first simulation model, termed “Randomized
Transactions,” is designed to replicate the dynamic
nature of blood donation and request processes within
a network of blood banks. Implemented in Python,
this simulation leverages the pandas library for effi-
cient data manipulation and processing. The key com-
ponents and processes of this simulation are as fol-
lows:
Our simulation uses two key data structures: the
BloodBank class, which manages inventory, re-
quests, and expired resources, and Pandas data
frames to store and manipulate data for blood
banks, users, inventory, and transactions, ensuring
efficient data management and analysis.
The simulation parameters span a 30-day period
from January 1 to January 30, 2023, encompass-
ing all eight major blood types (A+, A-, B+, B-,
AB+, AB-, O+, O-) and four key blood compo-
nents: Red Blood Cells (RBC), Platelets (PLAT),
Plasma (PLAS), and Whole Blood (WB). Each
blood component is assigned a specific expiration
period: 42 days for RBC, 365 days for PLAS, 5
days for PLAT, and 35 days for WB. These ex-
piration parameters closely align with real-world
medical guidelines, ensuring the simulation accu-
rately reflects the constraints of blood component
storage and usage (Aubron et al., 2018).
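These shelf lives reduce to a small lookup table in the simulation; a minimal sketch (names are ours):

```python
from datetime import date, timedelta

# Shelf life in days per blood component, as used in the simulation.
EXPIRATION_DAYS = {"RBC": 42, "PLAS": 365, "PLAT": 5, "WB": 35}

def expiry_date(component: str, donated_on: date) -> date:
    """Expiry date assigned to a newly donated unit of the given component."""
    return donated_on + timedelta(days=EXPIRATION_DAYS[component])

print(expiry_date("PLAT", date(2023, 1, 1)))  # 2023-01-06
```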
The simulation tracks two key performance met-
rics—acceptance ratio of blood requests and total
distance traveled for donations and requests.
This randomized simulation serves as a baseline for
analyzing the dynamics of blood supply and demand
within a network of blood banks. It provides founda-
tional insights into the variability and flow of dona-
tions and requests, offering a reference point for eval-
uating more advanced optimization techniques.
5.3.2 Simulation 2: Heuristic Approach with
Proximity and Expiration Prioritization
Building on the foundation of Simulation 1, our sec-
ond simulation model incorporates several heuristic
optimizations to enhance blood allocation efficiency
and minimize waste. The key new features introduced
in this model are three heuristic elements to optimize
blood allocation. First, it utilizes proximity-based se-
lection, identifying the closest 15% of blood centers
for each request, thereby reducing transportation time
and enhancing logistical efficiency. Second, the sys-
tem incorporates expiration prioritization, ensuring
that blood units nearing their expiration dates are allo-
cated first, minimizing waste from expired products.
Lastly, blood type compatibility is managed through
a comprehensive compatibility matrix (refer to Figure
2), implemented as a map data structure for efficient
lookup, allowing for flexible and accurate blood type
matching across the network.
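As a sketch of the proximity heuristic, the closest 15% of banks can be selected as follows. The distance function here is a placeholder, since the paper locates banks and users only by ZIP code:

```python
import math

def distance(zip_a: str, zip_b: str) -> float:
    # Placeholder metric; a real deployment would geocode ZIP codes first.
    return abs(int(zip_a) - int(zip_b))

def closest_banks(user_zip: str, banks: list, fraction: float = 0.15) -> list:
    """Return the closest `fraction` of blood banks to the requesting user."""
    ranked = sorted(banks, key=lambda b: distance(user_zip, b["zip_code"]))
    k = max(1, math.ceil(len(ranked) * fraction))
    return ranked[:k]
```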
5.3.3 Simulation 3: Heuristic Approach with
Rarity Scores
Simulation 3 extends the functionality of Simula-
tion 2 by incorporating a new heuristic: rarity scores
for blood types. This enhancement further opti-
mizes blood allocation by accounting for the relative
scarcity of each blood type. The rarity score system,
where lower scores represent rarer blood types, is im-
plemented using a map data structure to facilitate effi-
cient lookups and decision-making during allocation.
Incorporating scarcity alongside other key factors en-
hances resource utilization, ensuring more effective
allocation, particularly for rare blood types.
Algorithm 1: Simulation for blood bank management with expiry prioritization and rarity scores.

Data: Blood banks, users, inventory data, blood type compatibility matrix, expiration rules, rarity scores
Result: Updated transactions and inventory after simulation

Initialize simulation period, blood types, component expiration rules, and rarity scores;
Define blood type compatibility matrix;
foreach day in the simulation period do
    Remove expired blood based on expiration rules for each bank's inventory;
    while daily operations not complete do
        if random event is a request then
            Find 15% closest blood banks to the user;
            Check inventory using blood compatibility matrix and prioritize soon-to-expire resources;
            if blood bank can fulfill request (considering compatibility, expiration, and rarity) then
                Fulfill request and update inventory;
                Track distance, accepted requests, and selected blood type based on rarity scores;
            else
                Deny request and track failed attempts;
            end
        else
            Select a blood bank for donation;
            Add donation to the bank's inventory with an expiry date;
            Track distance and donation data;
        end
    end
end
Calculate performance metrics (acceptance ratio, total distance);
Save updated transactions and inventory data;
The rarity scores are inversely correlated with the
prevalence of each blood type in the general popula-
tion. For example, O+ has the highest score due to
its commonality, while AB- has the lowest score, re-
flecting its rarity. This scoring system enables the al-
gorithm to prioritize the conservation of rarer blood
types during allocation decisions, ensuring more efficient management of scarce resources.

Table 3: Blood type rarity scores.
Blood type Rarity score
O+ 4
A+ 3
O- 3
B+ 2
A- 2
B- 1
AB+ 1
AB- 1
This simulation retains the proximity-based selec-
tion, expiration prioritization, and blood type compat-
ibility matrix from Simulation 2 (refer to Figure 2 for
the compatibility matrix). The integration of rarity
scores with these existing heuristics in Algorithm 1
creates a more comprehensive approach to blood bank
management, potentially offering insights into strate-
gies for managing blood inventories more effectively,
especially for rare blood types.
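A compact way to combine these heuristics at unit-selection time is a lexicographic key: soonest-to-expire first, then the most common compatible type, so rarer inventory is held back. The sketch below assumes hypothetical field names:

```python
from typing import Optional

RARITY_SCORE = {"O+": 4, "A+": 3, "O-": 3, "B+": 2,
                "A-": 2, "B-": 1, "AB+": 1, "AB-": 1}  # Table 3: lower = rarer

def pick_unit(compatible_units: list) -> Optional[dict]:
    """Choose among compatible inventory units: earliest expiration first,
    breaking ties toward the most common blood type to conserve rare units."""
    if not compatible_units:
        return None
    return min(
        compatible_units,
        key=lambda u: (u["expiration_date"], -RARITY_SCORE[u["blood_type"]]),
    )
```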
5.3.4 Shortage Prediction Techniques
To further enhance our blood bank management sys-
tem, we implemented a shortage prediction module
designed to forecast the acceptance ratio of blood re-
quests for each blood bank up to 10 days in advance.
This predictive capability provides valuable insights,
enabling proactive donation efforts and more efficient
resource planning.
We collected acceptance ratio data for each blood
bank over a period of 180 days. This time series data
forms the basis of our prediction models. Figure 7
illustrates the variation in acceptance ratios across all
blood banks over this period.
Figure 7: Acceptance ratio changes for blood banks over
180 days.
We implemented and compared three distinct
time series prediction models, each offering unique
strengths in processing time-dependent data:
1. Long Short-Term Memory (LSTM) Networks:
A type of recurrent neural network capable of
learning long-term dependencies, well-suited for
capturing complex patterns in time series data.
2. Linear Regression: A simple yet effective ap-
proach for modeling linear trends in time series,
serving as a baseline for comparison with more
complex models.
3. Autoregressive Integrated Moving Average
(ARIMA): Combines autoregression, differenc-
ing, and moving average components, effective
for capturing various temporal structures in the
data.
Each model was trained on the 170-day historical
data and tasked with predicting the acceptance ratio
10 days into the future for each blood bank.
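As an example of the simplest of the three, a linear-trend forecaster over the 170-day history can be written as follows; this is a sketch, and the LSTM and ARIMA configurations are not reproduced here:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

def forecast_day_180(history: np.ndarray) -> float:
    """history: shape (170,) acceptance ratios for days 1..170.
    Fit a linear trend against the day index and extrapolate to day 180."""
    days = np.arange(1, 171).reshape(-1, 1)
    model = LinearRegression().fit(days, history)
    return float(model.predict(np.array([[180]]))[0])
```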
6 RESULTS AND DISCUSSION
6.1 Database Query Samples
To demonstrate the practical application of a NoSQL
database within the blood donation management sys-
tem, we present two sample queries written in Cas-
sandra Query Language (CQL) along with their cor-
responding results.
Figure 8: Query 1 to retrieve blood inventory for a specific
bank.
Figure 8 illustrates a query that retrieves the com-
plete blood inventory for a specific blood bank (bank
ID = 7). This query efficiently accesses detailed
inventory information, including blood types, com-
ponents, batch IDs, expiration dates, and quantities.
Such queries are essential for our optimization al-
gorithms, enabling precise assessment of blood unit
availability at a given location.
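Since Figure 8 is reproduced as an image, the following is a CQL statement consistent with its description, issued through the Python driver; table and column names follow the schema assumed in Section 4.1:

```python
from cassandra.cluster import Cluster

session = Cluster(["127.0.0.1"]).connect("blood_system")

# Full inventory for bank 7: bank_id is the partition key in the assumed
# schema, so this is a single-partition read.
rows = session.execute(
    "SELECT blood_type, component, batch_id, expiration_date, quantity "
    "FROM blood_inventory WHERE bank_id = 7"
)
for row in rows:
    print(row.blood_type, row.component, row.quantity, row.expiration_date)
```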
Figure 9 presents a more complex query that ag-
gregates the total quantity of O+ whole blood across
multiple blood banks. This type of query is especially
valuable for our shortage prediction models, enabling
analysis of the distribution and availability of specific
blood types and components across various locations.
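A statement consistent with the Figure 9 description is sketched below; SUM() requires Cassandra 3.0+, and filtering on non-key columns needs ALLOW FILTERING or a table designed for this access pattern:

```python
from cassandra.cluster import Cluster

session = Cluster(["127.0.0.1"]).connect("blood_system")

# Total O+ whole blood across all banks (columns as assumed in Section 4.1).
row = session.execute(
    "SELECT SUM(quantity) AS total_units FROM blood_inventory "
    "WHERE blood_type = 'O+' AND component = 'WB' ALLOW FILTERING"
).one()
print(row.total_units)
```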
Figure 9: Query 2 to aggregate O+ whole blood quantities across banks.

These query examples demonstrate the flexibility and power of our database schema in supporting both detailed inventory management and high-level analytics. The ability to efficiently retrieve and aggregate data in this manner is fundamental to the performance of our optimization and prediction algorithms.
We conducted three simulations to evaluate the
performance of our blood bank management system
under different optimization strategies. This section
presents the results and analyzes the improvements
achieved with each iteration.
6.2 Simulation Results
Table 4 summarizes the key performance metrics for each simulation; we analyze each of them below.
Table 4: Performance metrics across the three simulations.
Metric Randomized Heuristic 1 Heuristic 2
Total Accepted Requests 553 679 674
Total Denied Requests 147 123 84
Overall Acceptance Ratio 0.79 0.85 0.89
Total Units Traveled 2,496,171 1,573,463 1,551,953
6.2.1 Acceptance Ratio Analysis
The acceptance ratio demonstrated consistent im-
provement across the three simulations, indicating the
effectiveness of the progressively refined heuristics.
In the first comparison, moving from Simulation 1
to Simulation 2, we observed an increase from 79% to
85%, a change that was statistically significant with a
z-score of 2.79 and a p-value of 0.0026. This marks a clear improvement in blood allocation.
The second comparison, between Simulation 2
and Simulation 3, showed further gains, with the ac-
ceptance ratio rising from 85% to 89%. This improve-
ment, although smaller, remained statistically signifi-
cant, with a z-score of 2.12 and a p-value of 0.0170.
Finally, the cumulative effect from Simulation 1 to
Simulation 3 was even more striking, with the accep-
tance ratio increasing from 79% to 89%. This overall
improvement was highly significant, with a z-score of
4.90 and a p-value of less than 0.00001, underscor-
ing the substantial impact of incorporating additional
heuristic elements.
6.2.2 Distance Optimization
The total distance traveled decreased significantly:
Sim 1 to Sim 2: Reduced by 922,708 (36.96%
reduction)
Sim 2 to Sim 3: Further reduced by 21,510
(1.37% reduction)
Overall reduction (Sim 1 to Sim 3): 944,218
(37.83% reduction)
The distance reduction analysis reveals that the
first heuristic introduced in Simulation 2 was the
most effective in minimizing the distance traveled, ac-
counting for nearly all of the overall gains (36.96%).
The second heuristic, while contributing slightly, resulted in diminishing returns, with only 1.37% additional reduction. Therefore, most of the optimization occurred during the first transition, indicating that further improvements in distance reduction may require more advanced or different heuristics beyond those applied in Simulation 3.
6.2.3 Marginal Performance (MP) Analysis
To quantify the incremental benefits of each simula-
tion, we calculate the Marginal Performance (MP) us-
ing equation 1:
$$\mathrm{MP} = \frac{\text{Acceptance rate} - \text{Baseline acceptance rate}}{1 - \text{Baseline acceptance rate}} \tag{1}$$
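For example, the overall transition from Simulation 1 to Simulation 3 gives MP = (0.89 - 0.79)/(1 - 0.79) ≈ 47.6%, matching the final row of Table 5.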
Table 5 presents the results of this analysis:
Table 5: Marginal performance analysis.
Transition MP (Accept ratio) MP (Distance reduced)
S1 to S2 28.8% 37%
S2 to S3 26.7% 1.4%
S1 to S3 47.6% 37.8%
The MP analysis demonstrates that the largest im-
provements in both acceptance ratio and distance trav-
eled occur during the transition from S1 to S2. While
acceptance ratio continues to improve in the transition
S2 to S3, the gains in distance reduction are minimal,
suggesting diminishing returns in this aspect. Over-
all, the analysis shows that applying heuristic meth-
ods substantially improves performance, with nearly
48% improvement in acceptance rate and a 38% re-
duction in total distance traveled when comparing the
final simulation to the baseline.
6.3 Shortage Prediction Results
We evaluated three predictive models—Long
Short-Term Memory (LSTM), Linear Regression,
and Autoregressive Integrated Moving Average
(ARIMA)—for forecasting the acceptance ratio of
blood banks. Each model was trained on 170 days
of historical data and tasked with predicting the
acceptance ratio for day 180. The results of each
model’s performance across 20 blood banks are
presented below.
Table 6 summarizes the performance of each
model:
Table 6: Model performance comparison.
Model Mean percent difference
LSTM 1.48%
Linear Regression 1.40%
ARIMA 1.83%
Based on the mean percent difference, both the
LSTM and Linear Regression models performed sim-
ilarly well, with LSTM having a 1.48% difference and
Linear Regression slightly better at 1.40%. ARIMA,
while still performing reasonably, had a higher mean
percent difference of 1.83%. Overall, the results sug-
gest that Linear Regression and LSTM are the most
accurate models for predicting the acceptance ratio,
with Linear Regression showing a slight edge in per-
formance over the other two models.
Tables 7, 8, and 9 show the detailed results for
each model. Due to space constraints, we present re-
sults for a subset of blood banks.
Table 7: LSTM model results subset.
Bank ID Predicted Actual Percent difference
1 0.7816 0.7757 0.7648%
5 0.8771 0.8626 1.6785%
10 0.8184 0.8063 1.5101%
15 0.6610 0.6233 6.0379%
20 0.9465 0.9302 1.7483%
Table 8: Linear regression model results subset.
Bank ID Predicted Actual Percent difference
1 0.7807 0.7731 0.9736%
5 0.8801 0.8634 1.9322%
10 0.8205 0.8063 1.7704%
15 0.6349 0.6201 2.3944%
20 0.9390 0.9302 0.9380%
Table 9: ARIMA model results subset.
Bank ID Predicted Actual Percent difference
1 0.7799 0.7731 0.8683%
5 0.8801 0.8634 1.9407%
10 0.8219 0.8063 1.9428%
15 0.6535 0.6201 5.3974%
20 0.9380 0.9302 0.8306%

All three models demonstrate strong performance, with average absolute percentage differences ranging from 1.40% to 1.83%. Linear Regression performed best with a 1.40% difference, followed by LSTM at 1.48%, while ARIMA had the highest difference at 1.83%.
Despite their overall accuracy, all models
struggled with specific outliers, such as Bank 15,
indicating that certain blood banks may have more
stochastic acceptance ratios. The findings suggest
that simpler models like Linear Regression provide
a good balance between accuracy and computational
efficiency for this task. The results demonstrate the
feasibility of using predictive models to forecast
acceptance ratios, aiding resource allocation and
shortage prevention in blood banks.
7 CONCLUSIONS AND FUTURE
WORK
In conclusion, our approach demonstrates significant
improvements in blood supply management through
the introduction of additional heuristic matching cri-
teria such as proximity, blood compatibility matrix,
rarity scores, and expiration dates. These enhance-
ments led to a 38% reduction in the total distance trav-
eled, which is crucial in emergencies where timely
delivery is essential. Furthermore, our multi-level
heuristic matching strategy resulted in a 48% im-
provement in marginal performance (MP), highlight-
ing the potential for more efficient resource alloca-
tion.
Predicting blood transfusion acceptance rates over
a 10-day window also played a vital role in optimiz-
ing blood supply usage. Our evaluation shows that
while linear regression, LSTM, and ARIMA models
perform similarly in terms of accuracy, linear
regression stands out as a simpler statistical predic-
tive approach, offering comparable results with lower
computational complexity.
For future work, we intend to explore the devel-
opment of mobile and web applications designed to
maximize user engagement, focusing on both donor
participation and improving the user experience for
patients, especially those requiring frequent transfu-
sions (Li et al., 2023). Additionally, we aim to cre-
ate an integrated model that combines shortage pre-
diction with heuristic matching. This combined ap-
proach will be thoroughly assessed to determine its
impact on acceptance rates, comparing it to our cur-
rent system to further enhance efficiency and effec-
tiveness in blood supply management.
REFERENCES
Abramova, V. and Bernardino, J. (2013). Nosql databases:
Mongodb vs cassandra. In Proceedings of the inter-
national C* conference on computer science and soft-
ware engineering, pages 14–22.
American Society of Hematology (2020). Blood services
around the globe. ASH Clinical News.
Aubron, C., Flint, A. W., Ozier, Y., and McQuilten, Z.
(2018). Platelet storage duration and its clinical and
transfusion outcomes: a systematic review. Critical
Care, 22:1–13.
Barnes, L. S., Stanley, J., Bloch, E. M., Pagano, M. B., Ipe,
T. S., Eichbaum, Q., Wendel, S., Indrikovs, A., Cai,
W., and Delaney, M. (2022). Status of hospital-based
blood transfusion services in low-income and middle-
income countries: a cross-sectional international sur-
vey. BMJ Open, 12(2):e055017.
BCS (2022). Blood type compatibility: Which blood types
are compatible with each other? Accessed: 2024-10-
21.
Belfarsi, E. A. (2024). Blood donation/transfusion system.
Accessed: 2024-12-06.
Ben Elmir, W., Hemmak, A., and Senouci, B. (2023). Smart
platform for data blood bank management: forecast-
ing demand in blood supply chain using machine
learning. Information, 14(1):31.
Farrington, J. M., Utley, M., Alimam, S., Wong, W. K.,
and Li, K. (2023). Deep reinforcement learning for
managing platelets in a hospital blood bank. Blood,
142:2311.
Jokiaho, J. and community contributors (2023). Faker:
Python Package for Generating Fake Data. Version
15.3.3.
Kralievits, K. E., Raykar, N. P., Greenberg, S. L., and
Meara, J. G. (2015). The global blood supply: a liter-
ature review. The Lancet, 385:S28.
Laermans, J., O, D., Van den Bosch, E., De Buck, E.,
Compernolle, V., Shinar, E., and Vandekerckhove, P.
(2022). Impact of disasters on blood donation rates
and blood safety: A systematic review and meta-
analysis. Vox Sanguinis, 117(6):769–779.
Li, L., Valero, M., Keyser, R., Ukuku, A. M., and Zheng,
D. (2023). Mobile applications for encouraging blood
donation: A systematic review and case study. Digital
Health, 9:20552076231203603.
Perez-Miguel, C., Mendiburu, A., and Miguel-Alonso, J.
(2015). Modeling the availability of cassandra. Jour-
nal of Parallel and Distributed Computing, 86:29–44.
Petersen, J. and Jhala, D. (2023). Blood shortage moni-
toring: Comparison of the supply situation during the
omicron surge and at the tail end of the covid-19 pan-
demic this year. American Journal of Clinical Pathol-
ogy, 160(Supplement 1):S112–S113.
Raykar, N. P., Kralievits, K., Greenberg, S. L., Gillies,
R. D., Roy, N., and Meara, J. G. (2015). The blood
drought in context. The Lancet Global Health, 3:S4–
S5.
Riley, W., Love, K., and McCullough, J. (2021). Public pol-
icy impact of the covid-19 pandemic on blood supply
in the united states. American journal of public health,
111(5):860–866.
Roberts, N., James, S., Delaney, M., and Fitzmaurice, C.
(2019). The global need and availability of blood
products: a modelling study. The Lancet Haematol-
ogy, 6(12):e606–e615.
Sandaruwan, P., Dolapihilla, U., Karunathilaka, D., Wi-
jayaweera, W., Rankothge, W., and Gamage, N.
(2020). Towards an efficient and secure blood bank
management system. In 2020 IEEE 8th R10 Humani-
tarian Technology Conference (R10-HTC), pages 1–6.
IEEE.
Sarhan, M. A., Saleh, K. A., and Bin-Dajem, S. M. (2009).
Distribution of abo blood groups and rhesus factor in
southwest saudi arabia. Saudi Med J, 30(1):116–119.
Shih, H. and Rajendran, S. (2019). Comparison of
time series methods and machine learning algorithms
for forecasting taiwan blood services foundation’s
blood supply. Journal of healthcare engineering,
2019(1):6123745.
Talie, E., Wondiye, H., Kassie, N., and Gutema, H. (2020).
Voluntary blood donation among bahir dar university
students: application of integrated behavioral model,
bahir dar, northwest ethiopia, 2020. Journal of Blood
Medicine, pages 429–437.
Van Denakker, T. A., Al-Riyami, A. Z., Feghali, R., Gam-
mon, R., So-Osman, C., Crowe, E. P., Goel, R., Rai,
H., Tobian, A. A., and Bloch, E. M. (2023). Manag-
ing blood supplies during natural disasters, humanitar-
ian emergencies, and pandemics: lessons learned from
covid-19. Expert review of hematology, 16(7):501–
514.
Wijayathilaka, P., Gamage, P. P., De Silva, K., Athuko-
rala, A., Kahandawaarachchi, K., and Pulasinghe, K.
(2020). Secured, intelligent blood and organ donation
management system-“lifeshare”. In 2020 2nd Inter-
national Conference on Advancements in Computing
(ICAC), volume 1, pages 374–379. IEEE.
World Health Organization (2022). Blood transfusion: facts
in pictures. World Health Organization.
World Health Organization (2023). Blood safety and avail-
ability. Accessed: 2024-10-21.
Yedilkhan, D., Mukasheva, A., Bissengaliyeva, D., and
Suynullayev, Y. (2023). Performance analysis of scal-
ing nosql vs sql: A comparative study of mongodb,
cassandra, and postgresql. In 2023 IEEE International
Conference on Smart Information Systems and Tech-
nologies (SIST), pages 479–483. IEEE.