Holistic Cyber Threat Modeling for Machine Learning-Based Systems: A Case Study in Healthcare

Janno Jaal (1,2) and Hayretdin Bahsi (1,3)
(1) Department of Software Science, Tallinn University of Technology, Tallinn, Estonia
(2) Cybernetica AS, Tallinn, Estonia
(3) School of Informatics, Computing, and Cyber Systems, Northern Arizona University, U.S.A.
{jajaal, hayretdin.bahsi}@taltech.ee
ORCID: Janno Jaal https://orcid.org/0009-0001-6122-250X, Hayretdin Bahsi https://orcid.org/0000-0001-8882-4095
Keywords:
Healthcare, Machine Learning, Adversarial Attacks, Cyber Threats, Threat Modeling.
Abstract:
Considering the immense pace of progress in machine learning (ML) technology and related products, it may soon be difficult to imagine a software system, including a healthcare system, without a subsystem containing an ML model. However, ensuring the resiliency of these ML-based systems against cyber attacks is vital for more seamless and widespread technology usage. The secure-by-design principle, which considers security from the early stages of development, is a cornerstone for achieving sufficient security at a reasonable cost. The realization of this principle starts with conducting threat modeling to understand the relevant security posture and identify cyber security requirements before system design. Although threat modeling of software systems is widely known, it is unclear how to apply it to software systems with machine learning models. Similarly, although adversarial machine learning is a widely studied research topic, it has yet to be thoroughly researched how adversarial and conventional cybersecurity attacks can be holistically considered to identify applicable cyber threats at the early stages of a software development life cycle. This paper adapts STRIDE, a widely known threat modeling method, for the holistic cyber threat analysis of an ML-based healthcare system.
1 INTRODUCTION
Healthcare systems are responsible for a wide range
of functions and services to promote and maintain the
health of individuals and communities. An enormous
amount of health data is generated through electronic
health records, imaging, sensor data, and text (Min
et al., 2017). Thus, these systems have benefited
from the rapid progress in machine learning (ML),
and as a result, healthcare has become one of the
early adopters of this technology. For example, ML
applications equipped with Internet of Things (IoT)
solutions collect vast amounts of data through remote
monitoring devices, send them to the cloud and run
ML models on these data to enhance the management
of diagnosis and treatment efforts (Kakhi et al., 2022).
Health applications are highly susceptible to ma-
licious cyber actions, including unauthorized access,
theft or manipulation of medical records, malware in-
fections, or denial-of-service attacks. These cyber
threats can lead to severe consequences, ranging from
significant safety risks to patients to disruption of
critical healthcare services and substantial economic
losses.
Attackers typically compromise their target sys-
tems by exploiting applications, operating systems,
or network device vulnerabilities. These vulnerabil-
ities often arise from inadequate secure development
practices, which can be prevented by applying secure-
by-design principles. These principles consider se-
curity throughout the development life cycle, from
requirement analysis to implementation and mainte-
nance. As the vulnerabilities are identified and miti-
gated in the earliest possible stages, secure-by-design
reduces the cost of security-related tasks.
Threat modeling is a critical analysis task, typi-
cally done in the early stages, to identify the attack
surfaces and applicable cyber threats to the target sys-
tem (Xiong and Lagerström, 2019). It starts with
modeling the target and systematically elaborating on
cyber threats. Attack taxonomies obtained from secu-
rity frameworks and the views of experts (e.g., soft-
ware developers and system architects) participating
in the study shape the analysis.
ML models have increasingly been deployed into
software systems developed in-house or procured as
a commercial off-the-shelf (COTS) product. It is also possible to use cloud-based ML services. A considerable body of research
has been conducted on adversarial attacks that focus
on manipulating datasets and ML methods (Vassilev
et al., 2024). On the other hand, the system assets where ML models are deployed can be compromised by conventional cyber security attacks (e.g., stealing the model file via a network service vulnerability).
These cyber attacks may be a precondition for adver-
sarial attacks in most attack campaigns. Although the
system defenders must have a holistic view regard-
ing both attack categories, a considerable gap exists
between machine learning and cyber security experts
and their security practices (Apruzzese et al., 2023).
Threat modeling methods in cyber security were
first developed and applied to software systems
(Shostack, 2014) but were later extended for other
systems (e.g., industrial control systems (Khalil et al., 2023)). However, the adaptation of these methods to
ML-based software systems has not been explored
sufficiently due to a lack of coherent integration of
software engineering, machine learning and cyber
security disciplines in this problem area. Current
studies (Wilhjelm and Younis, 2020; Ali Alatwi and
Morisset, 2022; Mauri and Damiani, 2022) do not
propose a proper system modeling nor demonstrate
how adversarial and conventional cyber attacks can
be reconciled for a holistic threat analysis.
This paper demonstrates how a widely utilized
cyber threat modeling approach, STRIDE, can be
adapted to a system with machine learning-based
components. More specifically, in a healthcare sys-
tem case study, we first demonstrate how ML-related
assets can be represented in a data flow diagram
(DFD), constituting a system model notation for
STRIDE. Then, we identify the security boundaries
and systematically elicit the cyber threats applica-
ble to the target system. We put particular emphasis
on covering both adversarial and conventional cyber
threats during the elicitation.
We assume the system owner has in-house ca-
pabilities to manage the entire ML life cycle, en-
compassing stages such as data engineering, model
development, and model operation. Consequently,
our study provides comprehensive coverage of ML-
related activities and systems. The unique contribu-
tion of our paper is the in-depth demonstration of
system modeling for ML system assets and system-
atic threat elicitation with a holistic view of machine
learning and cyber security disciplines.
The remainder of the paper is organized as follows: Section 2 reviews the relevant literature. Section 3 presents the methods followed in this study. Section 4 gives the case study results. Section 5 presents the validation of the threat model results. Section 6 discusses our main findings. Section 7 concludes the paper.
2 RELATED WORK
A few studies have applied STRIDE to the cyber
threat modeling of ML-based systems (Wilhjelm and
Younis, 2020; Ali Alatwi and Morisset, 2022; Mauri
and Damiani, 2022). Wilhjelm and Younis presented
a DFD in which the ML model is obtained from a
third party. This study adopts an attack taxonomy for
systematic threat elicitation. Although it provides a
comprehensive proposal that includes a ranking and
mitigation of the threats, the system modeling does
not cover the whole ML life cycle, and the reasoning
behind DFD choices is not discussed. More specif-
ically, the study represents the ML model as a pro-
cess without separating the application software and
model repository, which may not be granular enough
to elaborate on specific threats. The study elicits only
adversarial-related threats.
Threat modeling of an intrusion detection sys-
tem is conducted in (Ali Alatwi and Morisset, 2022).
Although this study addresses both adversarial and
conventional threats, it utilizes two separate model-
ing frameworks, attack trees for adversarial ones and
STRIDE for conventional ones. Attack trees are pow-
erful in demonstrating the attack scenarios. However,
they do not provide instruments for system model-
ing and systematic threat elicitation. In this study,
the representation of the operational system is weak: the ML model is represented as an entity, a notation usually reserved for external actors, and the threats applicable to entities in STRIDE are limited. Another study
follows a similar approach with the same limitation
(Cagnazzo et al., 2018).
A threat modeling framework that uses the Fail-
ure Mode and Effects Analysis (FMEA) is applied to
an energy grid system that has an ML-based system
(Mauri and Damiani, 2022). FMEA, which is derived
from the safety domain, identifies the potential failure
modes of a product or system and then determines the
risks. This study uses the threat categories of STRIDE
to classify the findings and relies on attack trees of those categories for threat elicitation. The study uses a non-
standard notation of DFD, and it is unclear how the
threat elicitation is linked to the DFD. FMEA is a top-
down approach that first identifies the failure modes
and then the reasons causing them. Its implementa-
tion is complex, especially for IT and cyber security
professionals unfamiliar with safety concepts.
Some studies focus on threat modeling in health-
care without addressing ML-based systems. Threat
models, including STRIDE and LINDDUN, are ap-
plied and compared for systems that process elec-
tronic health records (Holik et al., 2023). Another
study conducts a device-level threat modeling for
Miniaturized Wireless Biomedical devices (Vakhter
et al., 2022).
As a research gap, we identify that cyber threat
modeling studies addressing ML-based systems do
not provide in-depth guidance about system modeling
choices. They do not address conventional and adver-
sarial threats within a unified modeling framework.
3 METHODS
Threat modeling starts with identifying the security
objectives that can be derived from organizational
policies, standards, regulations, and legal require-
ments (Khalil et al., 2023). The second stage can be
named system modeling or system mapping, in which
the target system is identified with the necessary de-
tails (Khalil et al., 2023). As threat modeling is con-
ducted at early stages in development life cycles, the
abstraction level in system modeling may depend on
the available information. However, the main system
assets and relevant data flows constitute the system
modeling. Then, the threat elicitation stage starts. A
systematic approach that suits the system model com-
ponents should be followed to achieve optimal cov-
erage of the applicable threats. Although a complete
threat modeling consists of stages such as impact/risk
assessment and identification of mitigations (Khalil
et al., 2023), these stages are out of scope in this pa-
per as our focus is threat identification for ML-based
system components. In this paper, we followed the
STRIDE method developed by Microsoft, which is
widely known and utilized by practitioners.
3.1 Security Objectives
Our study aimed to identify cyber threats to a health-
care system with ML-based assets. We mainly cov-
ered two threat types: (1) Well-known cyber threats
that address the confidentiality, integrity, and avail-
ability of the systems (i.e., we named them conven-
tional threats), such as malware, denial-of-service attacks, man-in-the-middle (MiTM) attacks, or unauthorized access. (2)
Adversarial ML threats specifically target ML mod-
els, algorithms, or training/validation data (Papernot
et al., 2018). They may include threats such as poi-
soning, evasion, or inference attacks (Vassilev et al.,
2024).
An attacker can compromise a model repository
by using a network service vulnerability and then steal
the model file to violate intellectual property, which
can be categorized as a conventional threat. On the
other hand, an attacker can query the model several
times and steal it via a model extraction attack (Chan-
drasekaran et al., 2020). This threat is categorized as
adversarial. However, in various situations, attack-
ers can only launch adversarial attacks if they fulfil
some preconditions via conventional ones. For in-
stance, querying the model several times in a model
extraction case may require the attacker to bypass
some limits enforced by network or application-level
access controls, which can be typically done via con-
ventional threats.
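To make the distinction concrete, the following minimal Python sketch (illustrative only, not part of the studied system) shows how an attacker who has already bypassed query limits could distill a surrogate model from query-prediction pairs; victim_predict stands in for a hypothetical black-box prediction API.

```python
# Illustrative sketch: model extraction via repeated queries.
# victim_predict is a hypothetical black-box API returning class labels.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def extract_surrogate(victim_predict, n_queries=5000, n_features=8, seed=0):
    rng = np.random.default_rng(seed)
    queries = rng.uniform(0.0, 1.0, size=(n_queries, n_features))  # probe inputs
    labels = victim_predict(queries)                               # observed model outputs
    surrogate = DecisionTreeClassifier(max_depth=10)
    surrogate.fit(queries, labels)                                 # distilled copy of the victim
    return surrogate
```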
It is important to note that this study does not
focus on privacy threat modeling. Although secu-
rity threat modeling allows us to identify various pri-
vacy violations once the assets and data flows re-
garding sensitive personal data are identified, it still
does not provide a complete privacy analysis. Privacy
threat models have a more comprehensive approach
to collecting, processing, and sharing personal data,
which can manifest in identifying more specific pri-
vacy threats, such as linkability or detectability (Deng
et al., 2011).
3.2 System Modeling
ML-based components can be incorporated into the
IT systems mainly in three ways (Estonian Informa-
tion Systems Authority, 2024): (1) An external ML
service that is created and maintained by a third-
party cloud provider or other companies and accessed
through APIs, (2) A pretrained or customized model
obtained from other sources or companies can be de-
ployed into the system. Such models can come within
a specific COTS product (e.g., medical imaging or di-
agnostic). (3) The model is developed and deployed
within an ML development life-cycle that entirely
runs in the organization (in-house development).
This study considers the third option, which
requires representing the whole ML life cycle in the
system model. STRIDE uses a Data Flow Diagram
(DFD), a semi-formal representation with specific
notation, for system modeling of software systems.
The notation classifies each system component into
a DFD element, such as a process, entity, data store,
or data flow. A process usually represents software
components performing data processing, such as
application servers, microservices, or authentication
servers. Entities refer to external parties, such as
users or external services. Data store notation is
typically used for databases or other data storage
forms, whereas data flows characterize the communi-
cations between system components. In STRIDE, the
applicable threats to each component are mapped as
given in Table 1. As this mapping is a significant part
of the systematic threat elicitation, selecting the most
appropriate DFD element notation for ML-based
assets is essential.
Table 1: Applicable Threats to DFD Elements (Shostack, 2014).

DFD Element | S | T | R | I | D | E
Entity      | X |   | X |   |   |
Data Flow   |   | X |   | X | X |
Data Store  |   | X | X | X | X |
Process     | X | X | X | X | X | X
We reviewed the academic and grey literature to
identify how ML-based assets and ML-life cycles are
represented under this notation and concluded that no
common understanding is applied across the case studies and that the life cycle is not covered comprehensively.
Cagnazzo et al. propose the simplest presenta-
tion - the AI/ML Model is just displayed as one ex-
ternal entity (Cagnazzo et al., 2018), which prevents
the elicitation of significant threats (e.g., tampering
and information disclosure). Although Alatwi and Morisset
introduce processes for representing data preprocess-
ing and model training stages in the DFD of the target
system, an entity is chosen for the model deployed
into the operational systems (Ali Alatwi and Moris-
set, 2022).
Based on the machine learning operations
(MLOps) principles, the life cycle comprises three
stages: design, model development and operations
(Visengeriyeva et al., 2023). The design stage
defined in this study covers data science activities
such as gathering requirements and checking data
availability. We excluded this stage in our DFD be-
cause these activities can be fulfilled with a series
of human-expert actions rather than well-defined data
flows. However, model development and operation
stages can be represented in DFD notations. The op-
eration stage includes model deployment, monitor-
ing and maintenance. In ENISA's report (European
Union Agency for Cybersecurity (ENISA), 2020), the
model development is divided into two steps: model
training and model tuning.
In our DFD, one of our critical decisions is to rep-
resent the model registry, which stores the models,
including their versions and metadata (e.g., hyperpa-
rameters, update times), as a data store. We consider
an entity inconvenient as that notation is reserved for external actors, and the applicable threats are only spoofing and repudiation. We characterize the model registry as a
data store rather than a process because model reg-
istries do not typically initiate a connection but re-
spond to queries of other system components without
interacting with end users, similar to a database.
We represent the life-cycle steps, data engineering,
model development, and model operation in the DFD.
It is important to note that each step can be repre-
sented with varying granularity levels. We divided
model development into two detailed steps, model
training and model tuning, as these functions may be
performed by two distinct teams and infrastructures,
depending on the organization’s scale. We create a
process for each step (i.e., data engineering, model training, and model tuning).
Performance monitoring is an essential function
in the model operation stage. In a real-world setting,
an unexpected decrease may occur in model perfor-
mance due to several reasons, such as concept drifts
(i.e., deviations in the source data distribution and,
thus, decision boundary) or functional errors intro-
duced in the software updates. Thus, performance
monitoring checks the accuracy of the model deci-
sions regularly. In our DFD, the operation stage com-
prises a data store representing the model registry
and a process embodying the performance monitor-
ing function. We identified the data flows between
these stages. The details about the proposed DFD are
given in Section 4.1.
3.3 Security Boundaries
In STRIDE, security boundaries are drawn accord-
ing to the attack surface. More specifically, poten-
tial malicious actors’ accessibility of system assets
determines the security boundaries. For instance, if
it is assumed that such actors may physically pene-
trate the area where wireless communication occurs
between the controller and mobile application, then
a security boundary can be drawn between those two
assets. Any data flow crossing the boundary and the
endpoints of such a flow, whether an entity, process or
data store, is considered in the threat elicitation. DFD
elements within the same boundary can be considered
trusted and excluded from the elicitation. The details
of the security boundary decisions made in this study
are presented in Section 4.2.
3.4 Threat Elicitation
In STRIDE, threat elicitation can be conducted using
two approaches: STRIDE-per-Element or STRIDE-
per-Interaction (Shostack, 2014). The former tra-
verses each element during the elicitation stage,
which has a broader coverage of the threat landscape
and is suitable for small or medium-scale systems.
The latter only enumerates the data flows crossing
the security boundaries, which is more applicable to
large-scale systems. We follow the former as our tar-
get system does not have a vast number of system as-
sets, and that approach enables us to elaborate on each
asset based on its role in ML life cycles.
In this stage, we identified conventional and ad-
versarial ML threats that can apply to each DFD ele-
ment to achieve a holistic view. It is important to note
that an element that does not interact with another ele-
ment belonging to a different security boundary is ex-
cluded during the elicitation stage. This means such
an element resides in a trusted zone and is assumed
to be secure. STRIDE provides an attack tree for each pair of a DFD element and threat type (Shostack, 2014).
For instance, a tampering threat against a process has
an attack tree, demonstrating the possible attack sce-
narios within the given threat context. Although these
attack trees were developed for software systems, we
used them while brainstorming about the applicabil-
ity of the threat and still found them helpful for threat
elicitation for ML-based assets. It is important to note
that such trees are less known and rarely applied in
research studies regarding threat modeling. Due to
the page limit in this paper, we do not provide a de-
tailed discussion of the threats within the framework
of corresponding attack trees. The identified threats
are given and discussed in Section 4.3.
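The following Python sketch illustrates the mechanics of STRIDE-per-Element elicitation using the mapping in Table 1; the element names and boundaries are simplified examples from the case-study DFD, and deciding whether a candidate threat actually applies remains a human brainstorming step supported by the attack trees.

```python
# Illustrative sketch of STRIDE-per-Element elicitation driven by Table 1.
STRIDE = ["Spoofing", "Tampering", "Repudiation",
          "Information disclosure", "Denial of service", "Elevation of privilege"]

APPLICABLE = {                      # mapping from Table 1 (Shostack, 2014)
    "entity":     {"Spoofing", "Repudiation"},
    "data_flow":  {"Tampering", "Information disclosure", "Denial of service"},
    "data_store": {"Tampering", "Repudiation", "Information disclosure", "Denial of service"},
    "process":    set(STRIDE),
}

# (name, element type, own boundary, boundaries of the elements it interacts with)
ELEMENTS = [
    ("Model Registry", "data_store", "model_operation", {"medical_network"}),
    ("Performance Monitoring", "process", "model_operation", {"data_engineering"}),
    ("Query", "data_flow", "model_operation", {"medical_network"}),
    ("Trained Model", "data_flow", "model_development", {"model_development"}),
]

def candidate_threats(elements):
    for name, kind, boundary, peers in elements:
        if peers <= {boundary}:                 # trusted zone: no cross-boundary interaction
            continue                            # excluded from elicitation
        for threat in sorted(APPLICABLE[kind]):
            yield name, threat                  # candidate pair, reviewed against attack trees

for name, threat in candidate_threats(ELEMENTS):
    print(f"{name}: {threat}")
```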
4 RESULTS
This section provides the results of each threat mod-
eling step introduced in Section 3. As the necessary
details about the security objectives are given in Sec-
tion 3.1, we exclude that stage.
4.1 System Modeling
The DFD given in Figure 1 demonstrates the whole
target system. Two external entities, ten processes,
two data stores and twenty-five data flows are identi-
fied. Patients and doctors (i.e., doctors representing
any medical staff) are considered entities. The remote
monitoring system on the patient’s network consists
of five processes: three for sensors with varying
functions (i.e., blood sugar, heart rate, SpO2), one for
sensor controller, and one for smartphone applica-
tion. The patient entity interacts with the smartphone
application to read the data obtained from sensors and
make necessary configuration changes to the sensors
via the sensor controller. The sensor controller pro-
cess collects sensor readings from sensors, processes,
aggregates and relays them to the central healthcare
system (CHS) process at the medical institution’s net-
work (i.e., named medical network from now
on) via the smartphone application.
The CHS process acts as a core system compo-
nent in the medical network. It interacts with the
central database, represented as a data store, to query
or store electronic medical records (EMRs). Patients
can query these records via a smartphone application,
whereas medical staff query them via their PCs or
smartphones. CHS process is also the main interface
with the ML-based system assets, so it provides raw
data to the ML development life cycle, queries the
model registry and receives the prediction results to
utilize ML support in the medical network. The med-
ical staff can use these predictions for better decision-making, and patients can use them to receive recommendations or assistance.
A similar mobile health application architecture is
given in (Latif et al., 2017).
Among the identified system assets, four pro-
cesses, one data store, and nine data flows can be con-
sidered ML-based system assets. As our primary pur-
pose in this study is to explore ML life cycles, we fo-
cus on these system assets and corresponding threats
from now on.
Four processes—data engineering, model train-
ing, model tuning, and performance monitor-
ing—represent the model development and model op-
eration steps in the ML life cycle.
Data Engineering. This step, represented as one
process, aims to analyse, clean and preprocess the
raw data (e.g., removing irrelevant data rows or la-
beling data) to make it suitable for the model devel-
opment phase. Feature engineering, i.e., identifying informative features and extracting new ones, can be done to improve the model performance. This step
can be investigated at a more fine-grained level by di-
viding the process into separate sub-processes (e.g.,
event-stream processing, data integration, data visual-
ization) and adding a data store element representing
a database (e.g., data lake) for raw data or extracted
features. The data engineering platforms can interac-
tively communicate with human experts to label the
data, creating additional attack surfaces. However, in
this study, we focus on the main incoming and out-
going data flows of data engineering tasks to simplify
the case study and represent the whole step as one
process.
As shown in Figure 1, the data engineering pro-
cess receives the data flow named raw data from the
central healthcare system and delivers labeled data,
represented as another data flow, to the model train-
ing process.
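As an illustration only (the concrete pipeline is outside the scope of this paper), a data engineering step of this kind could be sketched as follows; the column names, plausibility filters, and the placeholder labeling rule are assumptions for the example.

```python
# Hypothetical sketch of the data engineering process: clean the raw data
# received from the central healthcare system and emit labeled data.
import pandas as pd

def engineer_features(raw: pd.DataFrame) -> pd.DataFrame:
    # Remove incomplete or implausible sensor readings.
    df = raw.dropna(subset=["heart_rate", "spo2", "blood_sugar"])
    df = df[(df["heart_rate"] > 20) & (df["heart_rate"] < 250)].copy()
    # Example engineered feature combining two raw signals.
    df["hr_spo2_ratio"] = df["heart_rate"] / df["spo2"]
    # Placeholder labeling rule; in practice, labels come from clinical experts.
    df["label"] = (df["spo2"] < 92).astype(int)
    return df
```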
Figure 1: Data Flow Diagram.
Model Training & Model Tuning. We address
model development through two sequential pro-
cesses: model training and model tuning. The model
development step aims to create, test, and deliver a
stable model to an operational setting. An initial
model is created by utilizing a portion of the labeled data as training data in the model training process. This
process typically benchmarks various ML methods to
minimize the classification error on validation data.
The model tuning process obtains the created model,
optimizes it (e.g., hyperparameter optimization) and
prepares it for operation.
As shown in Figure 1, the model training process
obtains labeled data from the data engineering process
and sends the trained model to the model tuning stage
in a dedicated data flow. The model tuning process
delivers the final model (i.e., model deployment data
flow) to be deployed to the operational setting.
Similar to data engineering, the model develop-
ment step of the ML life cycle can be more de-
tailed with model repositories, version control or data
visualization processes; however, we prefer a more
coarse-grained approach for simplicity.
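A hedged sketch of the two model development processes is given below; the candidate methods, hyperparameter grids, and split ratio are illustrative assumptions rather than the configuration of the studied system.

```python
# Illustrative sketch: model training (method benchmarking) followed by
# model tuning (hyperparameter optimization) before deployment.
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

def train_and_tune(X, y):
    X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.2, random_state=42)
    candidates = {"random_forest": RandomForestClassifier(random_state=42),
                  "logistic_regression": LogisticRegression(max_iter=1000)}
    # Model training: benchmark candidate methods on the validation split.
    scores = {name: model.fit(X_tr, y_tr).score(X_val, y_val)
              for name, model in candidates.items()}
    best = max(scores, key=scores.get)
    # Model tuning: hyperparameter optimization of the selected method.
    grids = {"random_forest": {"n_estimators": [100, 300], "max_depth": [None, 10]},
             "logistic_regression": {"C": [0.1, 1.0, 10.0]}}
    search = GridSearchCV(candidates[best], grids[best], cv=5)
    search.fit(X_tr, y_tr)
    return search.best_estimator_  # final model handed to the deployment data flow
```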
Performance Monitoring. The operational stage of
the ML life cycle is represented by two DFD ele-
ments, a process for performance monitoring and a
data store for the model repository, as shown in Figure
1. This process checks the model’s performance peri-
odically. The models deployed to the operational sys-
tems are stored in the model repository with relevant
metadata (e.g., model versions). The queries of the
central healthcare system regarding the decisions of
the ML model are directed to the model repository in
a dedicated data flow. The prediction data flow returns
the model decisions to the central healthcare system
process. The performance monitoring process period-
ically queries the model repository in the maintenance
query data flow and retrieves maintenance prediction
in a separate data flow. As checking the model perfor-
mance can be realized by reference labeled data, this
process retrieves it in a data flow (i.e., performance
validation data) from the data engineering process.
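The following sketch illustrates such a periodic check; the registry client, the reference data source, and the accuracy threshold are hypothetical placeholders.

```python
# Illustrative sketch of the performance monitoring process.
from sklearn.metrics import accuracy_score

ACCURACY_THRESHOLD = 0.90  # assumed acceptance level set by the operator

def monitor(registry_client, reference_features, reference_labels):
    # Maintenance query / maintenance prediction data flows to the model registry.
    predictions = registry_client.predict(reference_features)
    accuracy = accuracy_score(reference_labels, predictions)
    if accuracy < ACCURACY_THRESHOLD:
        # Possible concept drift or faulty update: alert staff or roll back the model.
        return {"status": "degraded", "accuracy": accuracy}
    return {"status": "ok", "accuracy": accuracy}
```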
4.2 Security Boundaries
Our DFD is divided into six security boundaries.
Three of them are dedicated to ML-based assets. We
created one boundary for data engineering, one for
model development, and one for model operation, as
demonstrated in Figure 1.
The model development boundary comprises two processes: model training and model tuning. We assume that the trained model data flow between these two processes can be protected by organizational procedures. Thus, it stays within the security boundary, meaning threat elicitation for that DFD element is not required. However, we consider that the data flow that transfers labeled data from the data engineering to the model training process should not be trusted, as different teams in an organizational setting may handle these processes. More importantly, attackers may take particular interest in the integrity of labeled data to launch various attacks, including poisoning.
Another boundary is drawn for the assets of model
operation, namely the performance monitoring pro-
cess and model repository data store. We contemplate
that the final model (i.e., model deployment in Fig-
ure 1) obtained from the model tuning process should
be checked as the attackers may aim to manipulate or
copy that model before it reaches the operational sys-
tem. It is possible that the central healthcare system
process can be compromised as it resides at the attack
surface facing end users. Thus, the data flow between
that process and the model repository should be se-
cured. However, the data flows between performance monitoring and the model registry, namely maintenance query and maintenance prediction, stay within the security boundary, and there is no need for threat elicitation for them as long as the performance monitoring process and the model registry are protected. The data flow named performance validation data should be secured between the data engineering and performance monitoring processes. The attackers may aim to disturb the
performance monitoring function to reduce the oper-
ational benefit.
4.3 Threat Elicitation
By analyzing ML-based system assets, we identified
18 threats: ten for data flows, three for data stores, and five for processes. Eight of these threats are in the adversarial category, whereas ten are conventional ones.
Table 2 presents a selected subset of those threats.
Due to the page limit, we could not present all threats
in this paper. This subsection explains how we elicited threats for three ML-based DFD elements: one data flow, one data store, and one process.
Data Flow. Based on the mapping between DFD el-
ements and threat categories (see Table 1), a data flow
is subject to tampering, information disclosure and
denial of service. We selected the data flow named
query between the central healthcare system process
and model registry. The purpose of the query data
flow is to send patient data to the model to get predic-
tions about a health decision.
Tampering in this stage could lead to input manip-
ulation attacks, also called evasion attacks, in which
the attacker changes the inputs with small perturba-
tions to cause the model to make wrong predictions.
Sundas and Badotra presented various ways in which input manipulation can lead to the accuracy downfall of the model in the healthcare domain (Sundas and Badotra, 2022). Adversarial examples are the first risk listed in the top ten ML security issues provided by BIML (McGraw et al., 2020). Rahman et al. successfully
conducted evasion attacks for COVID-19 deep learn-
ing systems in medical IoT devices (Rahman et al.,
2021).
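A minimal black-box illustration of this threat (not taken from the cited works) is sketched below; model_predict is a hypothetical query interface, and the bounded random search merely demonstrates how small perturbations can flip a decision.

```python
# Illustrative black-box evasion sketch: search for a small perturbation
# that changes the model decision. model_predict is a hypothetical interface.
import numpy as np

def evade(model_predict, x, epsilon=0.05, trials=1000, seed=0):
    rng = np.random.default_rng(seed)
    original = model_predict(x.reshape(1, -1))[0]
    for _ in range(trials):
        delta = rng.uniform(-epsilon, epsilon, size=x.shape)   # small, bounded perturbation
        candidate = x + delta
        if model_predict(candidate.reshape(1, -1))[0] != original:
            return candidate                                   # adversarial example found
    return None                                                # no evasion within the budget
```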
As this data flow carries sensitive patient data, the
attacker can eavesdrop on the communication channel
using attack techniques (e.g., sniffing the network),
indicating an information disclosure threat. It is im-
portant to note that although the realization of this
threat is a significant achievement for an attacker who
aims to steal patient data, it can also be a precondi-
tion for an evasion attack as the attacker may need
to collect some queries and predictions to create a
shadow model and use it to generate samples to evade
the model.
An attacker may conduct a denial-of-service at-
tack by overwhelming the capacity of the model
repository with unnecessary queries and, thus, dis-
rupting critical medical decision-making (Ruan et al.,
2024). Depending on the network access control im-
plemented at the model operation boundary, spoofing
the central healthcare system process may be a pre-
condition for this threat.
Information disclosure and denial of service
threats identified in this data flow can be categorized
as conventional, whereas evasion threats are typical
ML-based threats.
Data Store. Data store elements are susceptible to
tampering, repudiation, information disclosure and
denial of service (see Table 1). Our DFD has one
data store, model registry, which can be categorized
as an ML-based asset.
The tampering threat may apply when the attacker
can replace the model with a malicious one by bypass-
ing the authorization function of the model repository.
This threat can cause severe consequences, ranging
from performance degradation to intentionally mis-
guiding medical decision-making (e.g., changing the
diagnosis from critical disease to normal). The at-
tacker could also change the functionality of the existing model, thus conducting model reprogramming. Model
reprogramming involves altering healthcare AI/ML
models to produce incorrect or biased outputs. Au-
thorization problems can also lead to information dis-
closure threats when they enable the attacker to steal
the model, causing significant intellectual property vi-
olations.
Process. A process is susceptible to spoofing, tam-
pering, repudiation, information disclosure, denial of
service and elevation of privileges. Here, we address
the data engineering process to demonstrate the elic-
itation for this DFD element type. Spoofing the data
engineering process may enable attackers to replace
the labeled data sent to the model training process,
a type of data poisoning attack. A similar situation applies to tampering. If the attackers can access the
data engineering process, they can carry out model
reprogramming attacks that result in a malfunction-
ing AI/ML model. Microsoft also introduced the neu-
ral net reprogramming threat in their AI/ML-specific
threats report (Marshall et al., 2022).
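As a toy illustration of such poisoning (not part of the identified system), the sketch below flips a small fraction of labels in the labeled data flow; the fraction and target class are arbitrary example parameters.

```python
# Toy label-flipping sketch: poisoning the labeled data flow between
# data engineering and model training. Parameters are arbitrary examples.
import numpy as np

def flip_labels(labels, fraction=0.05, target_class=0, seed=0):
    rng = np.random.default_rng(seed)
    poisoned = np.asarray(labels).copy()
    n_flip = int(fraction * len(poisoned))
    idx = rng.choice(len(poisoned), size=n_flip, replace=False)
    poisoned[idx] = target_class      # silently relabel the selected records
    return poisoned
```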
5 THE VALIDATION OF THE
THREAT MODEL RESULTS
We followed a qualitative approach to validate our results due to the lack of ground-truth data about all relevant threats. A semi-structured survey was sent to four experts (represented as EX1-EX4) with cyber security and ML backgrounds. The experts, who have more than five years of professional experience on average, were identified through the authors' professional networks.
All experts agreed that the design choices in the DFD make sense in general. They also agreed that using a data store element for the "Model Registry" is valid. EX3 suggested splitting the registry for the deployed model and the registry for older models and metadata into two separate elements. The experts also agreed with the representation of the model development phase and the boundaries. All experts agreed with the relevancy of the threats but provided additional comments. EX1 pointed out that, ideally, each threat would also have an impact tied to it. As our scope is threat identification rather than ranking, we will consider this feedback in future studies. EX1 mentioned that privilege escalation, human errors, rogue employees, and physical attacks could also be covered, and EX2 pointed out insider threats. EX3 also mentioned backdoor attacks. While these threats are present in the system, they are not ML-specific, so they were not identified in the analysis. However, privilege escalation and insider threats can be reconsidered for ML-based assets, and backdoor attacks can also target some assets in the model development stage.
6 DISCUSSION
In this paper, we conduct threat modeling of a system
with in-house ML model development capability. Al-
though this option comprehensively covers ML-based
system assets, other options, such as COTS products
with ML models or cloud ML services, may have
different cyber threats worth exploring in dedicated
studies. Threats originating from supply chain issues
or the security of the model updates in online mode
may be considered for COTS products. Semi-trusted
cloud admins, access control weaknesses in multi-
tenant cloud environments, or availability issues due
to sharing hardware and platform infrastructures can
be analyzed for cloud ML services.
The existing studies that introduce threat model-
ing to ML-based systems have weaknesses in sys-
tem modeling. The details of ML model represen-
tations in DFDs and design choices are not elaborated
(Wilhjelm and Younis, 2020; Ali Alatwi and Moris-
set, 2022; Mauri and Damiani, 2022). ML life cy-
cles, in general, and operational steps (e.g., perfor-
mance monitoring), in particular, have not been com-
pletely captured. The design considerations about
system boundaries are not usually justified. One of
our main contributions revolves around the detailed
system modeling and discussion of security bound-
aries.
This study uses STRIDE as the baseline method
for threat modeling. It is widely known and easy
to understand. Although more complex methods are
applied in the literature (e.g., business process nota-
tions, FMEA), we contemplate that usability is a sig-
nificant success factor. Threat modeling is a collab-
orative brainstorming activity between various stake-
holders (e.g., system owners, system architects and
security analysts). Instead of seeing the problem from
a method-centric lens, we suggest practitioners and
researchers see it from a process-centric perspective.
An approach that facilitates a common understanding
among people with different backgrounds and sys-
tematises the security knowledge around the target
system may give more useful results.
Accumulating and incorporating the existing se-
curity knowledge into the modeling process as a sys-
tematic knowledge base (e.g., attack taxonomies, li-
braries) would be an essential part of that process
view (Khalil et al., 2023). Although we use the con-
tent of some documents as an attack taxonomy for ad-
versarial attacks (Vassilev et al., 2024; Marshall et al., 2022), we do not put additional effort into creat-
ing a well-developed attack knowledge base. Creating
a knowledge base is addressed in (Wilhjelm and You-
nis, 2020). We will address this issue in our future
studies.
7 CONCLUSION & FUTURE
WORK
In this paper, we conducted cyber threat modeling of a
healthcare system that collects data from remote mon-
Table 2: Identified threats (selected samples).
Type - Conv: Conventional, Adv: Adversarial; DFD Elements - DF: Data Flow, P: Process, DS: Data Store.

Type | DFD Element | Identified threat | Threat description
Adv | Training Data (DF) | Data poisoning | Data poisoning in healthcare means intentionally manipulating medical data to influence AI/ML model training. This could be achieved through MiTM or spoofing attacks and the modification of data features or labels. This could result in inaccurate diagnoses and compromised patient care (Jagielski et al., 2018; Mozaffari-Kermani and Sur-Kolay, 2015; Vassilev et al., 2024; McGraw et al., 2020).
Conv | Raw Data (DF), Training Data (DF), Performance Validation Data (DF) | Data confidentiality threats | Data confidentiality threats in healthcare involve unauthorized access to or disclosure of sensitive medical data. This could be achieved by means like packet sniffing. This could lead to potential privacy breaches and misuse of personal information (McGraw et al., 2020).
Conv | Query (DF) | Denial of service | Denial of service disrupts healthcare AI/ML systems by overwhelming them with requests. Spoofing the source of the requests could be done for this. This can potentially disrupt critical medical decision-making (Ruan et al., 2024).
Adv | Prediction (DF) | Membership inference | During membership inference, the attacker tries to find out whether a particular record or sample was part of the training data by querying the model. Spoofing the processes and sending several queries can be one way to achieve this. This could uncover sensitive patient data, posing risks to privacy and confidentiality (Shokri et al., 2017; van Breugel et al., 2023).
Adv | Model Registry (DS) | Model reprogramming | Model reprogramming involves altering healthcare AI/ML models to produce incorrect or biased outputs. Attackers could access the model through weak access control mechanisms in the model registry. This could pose risks to patient safety and treatment effectiveness (Marshall et al., 2022).
Conv | Model Deployment (DS) | Model replacement | Model replacement involves replacing whole healthcare AI/ML models with malicious ones. Attackers could access the model through weak access control mechanisms in the model registry. This could pose risks to patient safety and treatment effectiveness.
Adv | Data Engineering (P) | Data poisoning | Data poisoning in healthcare means intentionally manipulating medical data to influence AI/ML model training. Attackers could use backdoors, privilege escalation, malware, or software vulnerabilities to access the process and carry out the poisoning. This could result in inaccurate diagnoses and compromised patient care (Jagielski et al., 2018; Mozaffari-Kermani and Sur-Kolay, 2015; Vassilev et al., 2024; McGraw et al., 2020).
Conv | Data Engineering (P), Performance Monitoring (P), Central Healthcare System (P) | Data confidentiality threats | Data confidentiality threats in healthcare involve unauthorized access to or disclosure of sensitive medical data. Attackers could use backdoors, malware, or software vulnerabilities to compromise the process. This could lead to potential privacy breaches and misuse of personal information (McGraw et al., 2020).
Adv | Model Training (P), Model Tuning (P) | Model reprogramming | Model reprogramming involves altering healthcare AI/ML models to produce incorrect or biased outputs. Attackers could gain access to the model through weak access control policies and maliciously fine-tune it. This could pose risks to patient safety and treatment effectiveness (Marshall et al., 2022).
itoring devices, sends it to central servers, and facili-
tates its analysis with ML models. Our particular em-
phasis is on ML-based system assets. We created a
system model covering the entire ML model develop-
ment life cycle and systematically elicited threats. In
the future, we aim to extend this study by prioritiz-
ing threats and identifying relevant countermeasures.
Privacy threat modeling of such systems is another re-
search direction.
REFERENCES
Ali Alatwi, H. and Morisset, C. (2022). Threat modeling
for machine learning-based network intrusion detec-
tion systems. In 2022 IEEE International Conference
on Big Data (Big Data), pages 4226–4235.
Marshall, A., Parikh, J., et al. (2022). Threat modeling AI/ML systems and dependencies. Microsoft.
Apruzzese, G., Anderson, H. S., Dambra, S., Freeman, D.,
Pierazzi, F., and Roundy, K. (2023). “real attack-
ers don’t compute gradients”: bridging the gap be-
tween adversarial ml research and practice. In 2023
IEEE Conference on Secure and Trustworthy Machine
Learning (SaTML), pages 339–364. IEEE.
Cagnazzo, M., Hertlein, M., Holz, T., and Pohlmann, N.
(2018). Threat modeling for mobile health systems. In
2018 IEEE Wireless Communications and Networking
Conference Workshops (WCNCW), pages 314–319.
Chandrasekaran, V., Chaudhuri, K., Giacomelli, I., Jha,
S., and Yan, S. (2020). Exploring connections be-
tween active learning and model extraction. In 29th
USENIX Security Symposium (USENIX Security 20),
pages 1309–1326.
Deng, M., Wuyts, K., Scandariato, R., Preneel, B., and
Joosen, W. (2011). A privacy threat analysis frame-
work: supporting the elicitation and fulfillment of
privacy requirements. Requirements Engineering,
16(1):3–32.
Visengeriyeva, L., Kammer, A., et al. (2023). MLOps principles. Last accessed 12.05.2024.
Estonian Information Systems Authority (2024). Tehisintellekti ja masinõppe tehnoloogia riskide ja nende leevendamise võimaluste uuring [Study on the risks of artificial intelligence and machine learning technology and their mitigation options].
European Union Agency for Cybersecurity (ENISA)
(2020). Artificial intelligence cybersecurity chal-
lenges.
McGraw, G., Figueroa, H., et al. (2020). An architectural risk analysis of machine learning systems: Toward more secure machine learning. Berryville Institute of Machine Learning (BIML).
Holik, F., Yeng, P., and Fauzi, M. A. (2023). A compara-
tive assessment of threat identification methods in ehr
systems. In Proceedings of the 8th International Con-
ference on Sustainable Information Engineering and
Technology, pages 529–537.
Jagielski, M., Oprea, A., and Biggio, B. (2018). Manipulating
machine learning: Poisoning attacks and countermea-
sures for regression learning. In 2018 IEEE Sympo-
sium on Security and Privacy (SP), pages 19–35.
Kakhi, K., Alizadehsani, R., Kabir, H. D., Khosravi, A., Na-
havandi, S., and Acharya, U. R. (2022). The internet
of medical things and artificial intelligence: trends,
challenges, and opportunities. Biocybernetics and
Biomedical Engineering, 42(3):749–771.
Khalil, S. M., Bahsi, H., and Korõtko, T. (2023). Threat
modeling of industrial control systems: A systematic
literature review. Computers & Security, page 103543.
Latif, S., Rana, R., Qadir, J., Ali, A., Imran, M., and Younis,
S. (2017). Mobile health in the developing world: Re-
view of literature and lessons from a case study. IEEE
Access, PP:1–1.
Mauri, L. and Damiani, E. (2022). Modeling threats to ai-
ml systems using stride. Sensors, 22(17).
Min, S., Lee, B., and Yoon, S. (2017). Deep learn-
ing in bioinformatics. Briefings in bioinformatics,
18(5):851–869.
Mozaffari-Kermani, M. and Sur-Kolay, S. (2015). Systematic poisoning attacks on and defenses for machine learning in healthcare. IEEE Journal of Biomedical and Health Informatics, 19(6):1893–1905.
Papernot, N., McDaniel, P., Sinha, A., and Wellman, M. P.
(2018). Sok: Security and privacy in machine learn-
ing. In 2018 IEEE European symposium on security
and privacy (EuroS&P), pages 399–414. IEEE.
Rahman, A., Hossain, M. S., Alrajeh, N. A., and Alsolami,
F. (2021). Adversarial examples—security threats to
covid-19 deep learning systems in medical iot devices.
IEEE Internet of Things Journal, 8(12):9603–9610.
Ruan, J., Liang, G., Zhao, H., Liu, G., Sun, X., Qiu, J., Xu,
Z., Wen, F., and Dong, Z. Y. (2024). Applying large
language models to power systems: Potential security
threats. IEEE Transactions on Smart Grid.
Shokri, R., Stronati, M., Song, C., and Shmatikov, V.
(2017). Membership inference attacks against ma-
chine learning models.
Shostack, A. (2014). Threat modeling: Designing for secu-
rity. John Wiley & Sons.
Sundas, A. and Badotra, S. (2022). Recurring threats to smart
healthcare systems based on machine learning. In
2022 10th International Conference on Reliability, In-
focom Technologies and Optimization (Trends and Fu-
ture Directions) (ICRITO), pages 1–8.
Vakhter, V., Soysal, B., Schaumont, P., and Guler, U.
(2022). Threat modeling and risk analysis for minia-
turized wireless biomedical devices. IEEE Internet of
Things Journal, 9(15):13338–13352.
van Breugel, B., Sun, H., Qian, Z., and van der Schaar, M.
(2023). Membership inference attacks against syn-
thetic data through overfitting detection.
Vassilev, A., Oprea, A., Fordyce, A., and Andersen, H.
(2024). Adversarial machine learning: A taxonomy
and terminology of attacks and mitigations.
Wilhjelm, C. and Younis, A. A. (2020). A threat analysis
methodology for security requirements elicitation in
machine learning based systems. In 2020 IEEE 20th
International Conference on Software Quality, Relia-
bility and Security Companion (QRS-C), pages 426–
433.
Xiong, W. and Lagerström, R. (2019). Threat modeling - a systematic literature review. Computers & Security, 84:53–69.